[Tinyos-devel] retransmissions in networking protocols
Andreas Koepke
koepke at tkn.tu-berlin.de
Sat Oct 6 05:24:35 PDT 2007
> On Oct 5, 2007, at 4:32 PM, Omprakash Gnawali wrote:
>
>>
>>> Om -- we have some detailed MLQI traces, right? Did we log duplicate
>>> suppression in them? Or, at the very least, we can definitely measure
>>> the duplicate delivery rate at the root, right? It would be good to
>>> know how bad the problem is.
>>>
>>> Phil
>>
>> This is data from Mirage, the sixth column tells you the number of
>> duplicate packets received from that origin and the seventh column
>> tells you what fraction of the total pkts received from the origin
>> were duplicates:
>>
>> #node total_sent uniq_rcv success_rate total_rcv repeated_rcv (frac), minseq, maxseq
>> 3 369 369 1.000 370 1 0.003, 1, 369
>> 4 373 373 1.000 375 2 0.005, 1, 373
>> 5 368 367 0.997 381 14 0.037, 2, 369
>> 6 374 374 1.000 374 0 0.000, 1, 374
>> 9 368 368 1.000 369 1 0.003, 1, 368
>> 10 364 364 1.000 366 2 0.005, 1, 364
>> 13 366 366 1.000 367 1 0.003, 1, 366
>> 16 365 365 1.000 365 0 0.000, 2, 366
>> 17 372 372 1.000 372 0 0.000, 1, 372
>> 18 369 369 1.000 370 1 0.003, 2, 370
>> 19 352 232 0.659 233 1 0.004, 3, 354
>> 20 372 372 1.000 372 0 0.000, 2, 373
>> 21 372 294 0.790 296 2 0.007, 1, 372
>> 22 369 369 1.000 370 1 0.003, 1, 369
>> 24 367 358 0.975 364 6 0.016, 1, 367
>> 26 372 371 0.997 372 1 0.003, 1, 372
>> 27 362 342 0.945 344 2 0.006, 1, 362
>> 28 362 351 0.970 352 1 0.003, 1, 362
>> 30 369 369 1.000 369 0 0.000, 1, 369
>> 32 368 366 0.995 367 1 0.003, 1, 368
>> 33 375 375 1.000 375 0 0.000, 2, 376
>> 36 364 364 1.000 365 1 0.003, 2, 365
>> 37 360 360 1.000 360 0 0.000, 2, 361
>> 39 357 357 1.000 357 0 0.000, 2, 358
>> 40 358 358 1.000 359 1 0.003, 1, 358
>> 41 368 336 0.913 339 3 0.009, 2, 369
>> 43 371 371 1.000 375 4 0.011, 2, 372
>> 44 372 371 0.997 374 3 0.008, 1, 372
>> 47 365 362 0.992 362 0 0.000, 2, 366
>> 48 359 359 1.000 361 2 0.006, 2, 360
>> 49 373 263 0.705 266 3 0.011, 2, 374
>> 50 371 370 0.997 371 1 0.003, 2, 372
>> 52 365 356 0.975 357 1 0.003, 3, 367
>> 53 357 282 0.790 282 0 0.000, 2, 358
>> 55 359 359 1.000 360 1 0.003, 2, 360
>> 56 358 229 0.640 230 1 0.004, 3, 360
>> 57 367 367 1.000 368 1 0.003, 2, 368
>> 58 364 364 1.000 365 1 0.003, 2, 365
>> 59 364 349 0.959 351 2 0.006, 3, 366
>> 60 366 366 1.000 367 1 0.003, 2, 367
>> 61 362 362 1.000 385 23 0.060, 1, 362
>> 62 369 369 1.000 369 0 0.000, 1, 369
>> 66 360 360 1.000 364 4 0.011, 1, 360
>> 68 370 370 1.000 371 1 0.003, 1, 370
>> 69 367 367 1.000 368 1 0.003, 2, 368
>> 70 370 370 1.000 373 3 0.008, 2, 371
>> 71 365 364 0.997 365 1 0.003, 1, 365
>> 77 364 363 0.997 368 5 0.014, 1, 364
>> 78 365 365 1.000 366 1 0.003, 2, 366
>> 82 365 259 0.710 262 3 0.011, 3, 367
>> 85 367 352 0.959 353 1 0.003, 3, 369
>> 86 373 361 0.968 362 1 0.003, 2, 374
>> 89 370 370 1.000 371 1 0.003, 2, 371
>> 92 362 362 1.000 365 3 0.008, 3, 364
>> 95 379 379 1.000 381 2 0.005, 2, 380
>> 96 365 365 1.000 366 1 0.003, 3, 367
>> 100 367 367 1.000 368 1 0.003, 2, 368
>> 102 368 368 1.000 371 3 0.008, 1, 368
>> 104 375 375 1.000 379 4 0.011, 1, 375
>> 105 372 372 1.000 373 1 0.003, 2, 373
>> 107 363 357 0.983 359 2 0.006, 2, 364
>> 109 371 370 0.997 371 1 0.003, 2, 372
>> 112 368 366 0.995 367 1 0.003, 2, 369
>> 115 371 370 0.997 375 5 0.013, 2, 372
>> 116 369 368 0.997 372 4 0.011, 2, 370
>> 117 372 369 0.992 369 0 0.000, 2, 373
>> 119 360 352 0.978 352 0 0.000, 3, 362
>> 120 375 374 0.997 376 2 0.005, 2, 376
>> 122 368 352 0.957 353 1 0.003, 2, 369
>> 123 355 355 1.000 357 2 0.006, 2, 356
>> 124 368 368 1.000 372 4 0.011, 2, 369
>> 125 360 360 1.000 362 2 0.006, 2, 361
>> 130 371 371 1.000 373 2 0.005, 2, 372
>> 131 367 255 0.695 255 0 0.000, 2, 368
>> 132 364 339 0.931 339 0 0.000, 3, 366
>> 133 360 343 0.953 343 0 0.000, 3, 362
>> 137 378 370 0.979 371 1 0.003, 2, 379
>> 138 373 351 0.941 352 1 0.003, 2, 374
>> 140 366 348 0.951 351 3 0.009, 3, 368
>> 141 363 350 0.964 352 2 0.006, 2, 364
>> 142 377 313 0.830 314 1 0.003, 2, 378
>> 143 373 347 0.930 349 2 0.006, 2, 374
>> 145 376 346 0.920 347 1 0.003, 2, 377
>> 146 364 243 0.668 244 1 0.004, 2, 365
>> 148 369 344 0.932 344 0 0.000, 2, 370
>> #delivery 0.959233382475482
>> #throughtput 8.35853449121161
>> #time 3600.033
>>
>> There were a total of 161 duplicate packets received by the root out
>> of total 30091 pkts received.
>>
>> Looking at the duplicates in the network, lqi detected and suppressed
>> 1943 duplicates out of total 76606 message receptions in the network
>> (includes root). So, it seems about 2.5% packets are duplicates. Here
>> are the number of duplicate events logged by the nodes:
>>
>> node id #duplicates dropped
>> 68 806
>> 78 314
>> 6 240
>> 18 185
>> 26 147
>> 89 45
>> 112 43
>> 71 23
>> 102 23
>> 22 21
>> 37 19
>> 130 10
>> 50 9
>> 123 8
>> 120 7
>> 33 6
>> 125 5
>> 104 5
>> 96 3
>> 48 3
>> 133 3
>> 95 2
>> 5 2
>> 39 2
>> 115 2
>> 107 2
>> 70 1
>> 61 1
>> 57 1
>> 36 1
>> 3 1
>> 24 1
>> 16 1
>> 10 1
>>
>>
>> Then the question is if any node missed some duplicates due to cache
>> size limitation. I wrote a script that checked to see if the same
>> <origin, seq> were transmitted (FW_FWD_MSG) by a node within a 100-pkt
>> window, and there was not a single instance of missed duplicate due to
>> cache size limitation.
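A check along these lines can be sketched as follows. This is not Om's script: the log format (a list of (node, origin, seq) forwarding events in time order) and the function name are assumptions made for illustration, and only the 100-packet window comes from the description above.

```python
from collections import defaultdict, deque


def missed_duplicates(forward_events, window=100):
    """Find packets that a node forwarded twice within `window` forwarded packets.

    forward_events: iterable of (node, origin, seq) tuples in time order,
    one per forwarding event (a hypothetical stand-in for FW_FWD_MSG entries).
    """
    # Per-node history of the last `window` forwarded (origin, seq) pairs.
    recent = defaultdict(lambda: deque(maxlen=window))
    missed = []
    for node, origin, seq in forward_events:
        key = (origin, seq)
        if key in recent[node]:
            # Same packet forwarded again by the same node: suppression missed it.
            missed.append((node, origin, seq))
        recent[node].append(key)
    return missed


# Node 68 forwards <origin 5, seq 1> twice; node 78 forwards it only once.
events = [(68, 5, 1), (68, 9, 1), (68, 5, 1), (78, 5, 1)]
print(missed_duplicates(events))
```

An empty result for a given window means no node forwarded the same <origin, seq> twice within that many forwarded packets, which is the outcome reported above for a window of 100.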
>
> Om: This is useful data. It looks like MLQI has a lower duplication
> rate (0.5%) than CTP (1%), which makes sense given its slower route
> adaptation. However, some origins show up to 6.0% duplication (node
> 61; node 5 is at 3.7%). My guess is that these are nodes whose next
> hop is heavily loaded, so there is a larger chance of a cache flush?
Cool data, mine is not as good -- though I have not put it into a table
yet. I see roughly 5% to 10% duplicates at the root node. Below 1% is
OK, but the app developer should be warned that the network is not
trying to be foolproof here. OK, he should know that anyway ;-)
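As a quick sanity check on the rates quoted above, the fractions can be recomputed from the totals in Om's mail. This is only arithmetic on numbers that appear in this thread; the variable and function names are mine.

```python
# Totals as reported in Om's mail (Mirage run, MLQI).
root_dups, root_total = 161, 30091   # duplicates / packets received at the root
net_dups, net_total = 1943, 76606    # suppressed duplicates / receptions network-wide


def dup_fraction(dups, total):
    """Fraction of received packets that were duplicates."""
    return dups / total


root_rate = dup_fraction(root_dups, root_total)
net_rate = dup_fraction(net_dups, net_total)
print(f"root: {root_rate:.2%}  network: {net_rate:.2%}")

# Two per-origin rows from the table: (node, total_rcv, repeated_rcv).
rows = [(5, 381, 14), (61, 385, 23)]
for node, total_rcv, repeated in rows:
    print(node, round(dup_fraction(repeated, total_rcv), 3))
```

The root sees about 0.5% duplicates and the network as a whole about 2.5%, and the per-origin fractions match the seventh column of the table (0.037 for node 5, 0.060 for node 61).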
> Andreas: One thing about source-based suppression is that it requires
> an additional protocol field. Specifically, you need to be able to
> distinguish packets that are duplicates caused by retransmissions and
> those that are duplicates caused by routing loops. I.e., if you can
> quickly repair routing loops, then it can be OK to let a packet loop
> around a few times. In fact, you can take advantage of that data
> packet to detect and repair the loops. Joe Polastre argued for this
> approach pretty strongly in some early net2 meetings, and our
> experimental results back up his intuition.
I did a simple test and suppressed the duplicates before the loop stuff was
done. I'm not going to do that a second time ;-) MLQI broke down
completely. It seems that MLQI relies pretty strongly on this feature.
> In the case of CTP, this field is called THL (time has lived), and it
> increments on each hop. You only suppress a packet if the source ID,
> source sequence number, *and* THL match. Otherwise, it could be a
> looping packet. I can't recall the exact numbers, but dropping looped
> packets definitely prevents you from getting 3 nines of reliability,
> and it might prevent 2 nines.
Hmm -- MLQI has that field, or at least I thought so: hopcount.
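To make the suppression rule concrete, here is a minimal sketch of a duplicate cache keyed on (origin, sequence number, THL). It is not CTP's or MLQI's actual code; the class name, the cache size of 4, and the least-recently-used eviction policy are all assumptions for illustration.

```python
from collections import OrderedDict


class DuplicateCache:
    """Bounded cache of recently seen (origin, seqno, thl) triples.

    Hypothetical illustration: the size and LRU eviction are assumptions.
    """

    def __init__(self, size=4):
        self.size = size
        self.seen = OrderedDict()

    def is_duplicate(self, origin, seqno, thl):
        """Return True if this exact triple was seen recently, else record it."""
        key = (origin, seqno, thl)
        if key in self.seen:
            self.seen.move_to_end(key)      # refresh its position in the cache
            return True
        self.seen[key] = True
        if len(self.seen) > self.size:
            self.seen.popitem(last=False)   # evict the least recently used entry
        return False


cache = DuplicateCache(size=4)
first = cache.is_duplicate(origin=61, seqno=10, thl=2)   # new packet
retx = cache.is_duplicate(origin=61, seqno=10, thl=2)    # retransmission, same THL
looped = cache.is_duplicate(origin=61, seqno=10, thl=5)  # same packet after more hops
print(first, retx, looped)
```

A retransmission arrives with the same THL and is suppressed. A packet that has gone around a loop arrives with a larger THL, so it is not treated as a duplicate and can still be forwarded and used to detect the loop.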
> The reason why CTP is somewhat vulnerable to this is that it can
> switch routes really quickly. MLQI has a slower adaptation rate due
> to its 30s beacons. If you're seeing lots of duplicates, then I think
> it has to do with how you are emulating LQI. Don't forget that LQI on
> the CC2420, being a soft chip decision indicator, is (very roughly)
> low-order bits of SNR when you are close to the SNR reception
> threshold, and is pegged when you have a high SNR. If you are using a
> more linear measure (e.g., direct SNR), then you are going to see a
> lot more variation than LQI for the CC2420 does, which will cause a
> much higher rate of route switchovers. One of the motivations for the
> white bit came from the observation that MLQI's scaling function
> essentially means that 95% of the links it uses are near-perfect
> ones: so why not just encode this information as a single bit?
I'm using SNR in [dB]; it does not fluctuate too much -- that is the
beauty of being the only user of the channel ;-) The routes are pretty
stable: the packets follow the same routes for hours. So I'm pretty sure
that it is not the parent switchovers; this is also what the packet
traces indicate. I will try increasing the MHOP history size and see.
> Phil