[Tinyos Core WG] Re: CC2420 stack
Philip Levis
pal at cs.stanford.edu
Fri Feb 9 09:03:44 PST 2007
On Feb 8, 2007, at 8:19 PM, David Moss wrote:
> Just an update on the micaz radio rebooting node problem -
>
> The micaz radio node reboot problem occurs intermittently on
> receiver-only motes, but not transmit-only motes.
>
> I have an application (attached) located in tinyos-2.x-contrib/
> rincon/apps/tests/RadioReliabilityTest that demonstrates the
> issue. If you program one micaz up with that app with ID 0, it
> becomes a transmit-only node. Every other address is receive-only.
> By running 'listen', you can see when the mote reboots because a
> dummy packet is sent to serial on boot.
>
> Below is the rebooting output I'm seeing from a receive-only micaz,
> so you can see how intermittent the issue is. Note that receive-
> only telosb's never have this problem.
>
>
> Thu Feb 08 20:49:52 MST 2007:
> BOOT
>
> Thu Feb 08 20:50:06 MST 2007:
> BOOT
>
> Thu Feb 08 20:50:32 MST 2007:
> BOOT
>
> Thu Feb 08 20:50:58 MST 2007:
> BOOT
>
> Thu Feb 08 20:50:59 MST 2007:
> BOOT
>
> Thu Feb 08 20:51:37 MST 2007:
> BOOT
>
> Thu Feb 08 20:51:39 MST 2007:
> BOOT
>
> Thu Feb 08 20:51:53 MST 2007:
> BOOT
>
> Thu Feb 08 20:53:49 MST 2007:
> BOOT
>
> Thu Feb 08 20:55:42 MST 2007:
> BOOT
>
> Thu Feb 08 20:55:57 MST 2007:
> BOOT
>
> Thu Feb 08 20:56:30 MST 2007:
> BOOT
I've cc'd John Regehr, who's said that (once he gets the time) he'll
try to take a look at the problem with his bug-finding tools. A
student working with Dawson is also looking for TinyOS bugs, but is
using TOSSIM so low-level radio stack stuff won't be captured (and it
seems likely that's what this is).
Jung Il noted that this problem is much more pronounced when there
isn't address decoding.
The evidence so far points to a problem on the stack: either the
stack overflows or the stack is corrupted, causing the program to
return to address 0 (the reset vector). The main clue is that none of
the reset bits are set when the reset occurs, which suggests a jump
to address 0 rather than a hardware reset. Scanning through the code,
there don't seem to be any obvious places where data is copied onto
the stack, and all of the interrupt handlers are atomic handlers
(making re-entry unlikely). There's some evidence that reboots can be
correlated between nodes: two nodes receive the same packet and reboot.
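
As a rough illustration of the reset-bit check described above, here is
a minimal C sketch that classifies a saved copy of the reset-flag
register. It assumes the ATmega128 MCUCSR bit layout (PORF, EXTRF, BORF,
WDRF, JTRF in bits 0 through 4). The names reset_cause_t and
classify_reset are hypothetical and are not part of TinyOS. It also
assumes boot code clears the flags after reading them, since they
otherwise persist across later resets.

```c
#include <stdint.h>

/* Reset-flag bit masks, assuming the ATmega128 MCUCSR layout. */
#define PORF_MASK  (1u << 0)  /* power-on reset */
#define EXTRF_MASK (1u << 1)  /* external reset */
#define BORF_MASK  (1u << 2)  /* brown-out reset */
#define WDRF_MASK  (1u << 3)  /* watchdog reset */
#define JTRF_MASK  (1u << 4)  /* JTAG reset */

/* Hypothetical classification type, for illustration only. */
typedef enum {
    RESET_NONE_RECORDED,  /* no flag set: consistent with a jump to address 0 */
    RESET_POWER_ON,
    RESET_EXTERNAL,
    RESET_BROWN_OUT,
    RESET_WATCHDOG,
    RESET_JTAG
} reset_cause_t;

/* Classify a saved register value. Power-on is checked first because
 * other flags may be set alongside it during power-up. Bits above
 * bit 4 are not reset flags and are ignored. */
reset_cause_t classify_reset(uint8_t mcucsr)
{
    if (mcucsr & PORF_MASK)  return RESET_POWER_ON;
    if (mcucsr & EXTRF_MASK) return RESET_EXTERNAL;
    if (mcucsr & BORF_MASK)  return RESET_BROWN_OUT;
    if (mcucsr & WDRF_MASK)  return RESET_WATCHDOG;
    if (mcucsr & JTRF_MASK)  return RESET_JTAG;
    return RESET_NONE_RECORDED;
}
```

A boot hook could send the classification in the dummy BOOT packet, so
that a RESET_NONE_RECORDED result would line up with the "return to
address 0" theory.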
One thought -- the major difference between the micaz and the telos
stack is the fact that the micaz emulates one of the interrupts with
software. It does this with a 1ms timer. But this interrupt is only
handled when the radio is turned on.
tos/platforms/micaz/chips/cc2420/HplCC2420InterruptsP.nc:
  task void CCATask() {
    atomic {
      if (!ccaTimerDisabled)
        call CCATimer.startOneShot(1);
    }
  }
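
For readers unfamiliar with the idea, emulating an edge interrupt by
polling a pin on a periodic timer can be sketched in plain C as below.
This is only a host-side illustration of the general technique, not the
actual HplCC2420InterruptsP logic. The names soft_irq_t, soft_irq_init,
and soft_irq_poll are hypothetical, and the "enabled" field only loosely
mirrors the ccaTimerDisabled flag in the snippet above.

```c
#include <stdbool.h>

/* Hypothetical state for one software-emulated edge interrupt. */
typedef struct {
    bool enabled;     /* polling active (roughly !ccaTimerDisabled) */
    bool rising;      /* true: detect rising edge, false: falling edge */
    bool last_level;  /* pin level seen on the previous poll */
    unsigned fired;   /* number of emulated interrupts so far */
} soft_irq_t;

void soft_irq_init(soft_irq_t *s, bool rising, bool initial_level)
{
    s->enabled = true;
    s->rising = rising;
    s->last_level = initial_level;
    s->fired = 0;
}

/* Call once per timer tick (e.g. every 1 ms) with the current pin level.
 * Returns true when the configured edge was seen since the last poll.
 * Pulses shorter than the polling period can be missed entirely. */
bool soft_irq_poll(soft_irq_t *s, bool level)
{
    if (!s->enabled)
        return false;

    bool edge = (level != s->last_level) && (level == s->rising);
    s->last_level = level;
    if (edge)
        s->fired++;
    return edge;
}
```

The point of the sketch is that the polled path runs on every tick
while enabled, which is the kind of extra activity the telos stack,
with its real hardware interrupt, would not have.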
Phil