[Tinyos Core WG] Re: [Tinyos-devel] micaz (atm128?) serial problem in 2.x

Philip Levis pal at cs.stanford.edu
Tue Oct 3 09:04:03 PDT 2006


On Oct 3, 2006, at 2:20 AM, Philip Levis wrote:

> On Oct 2, 2006, at 9:12 PM, eweddington at cso.atmel.com wrote:
>
>> Hi Phil,
>>
>> I would rule out a stray memory bug causing the watchdog to be  
>> enabled.
>> On the mega128, it requires a specific code sequence to enable the
>> watchdog, which a stray memory bug probably won't be able to do.
>
> Agreed. But looking at the reset conditions, I'm hard pressed to  
> come up with an alternative explanation. There are five causes:
>
> Power-On Reset
> External Reset
> Watchdog Reset
> Brown-out Reset
> JTAG AVR Reset
>
> I first encountered the problem on motes that were plugged into  
> EPRB boards so had power over ethernet. It only happens when they  
> are put under significant load. I've had it happen on nodes that  
> were battery powered as well.
>
>> IIRC, there is a register in the m128 that allows you to discover the
>> source of a reset. Sorry, I don't have the datasheet in front of  
>> me, so
>> I don't know the specific register. But I would suggest putting in  
>> some
>> debug code to take a look at that and see if that provides any  
>> further
>> clues.

Some further information (thanks everyone who sent me pointers).

The student set up the boot sequence (RealMainP) to read MCUCSR. He  
ran a series of tests, and came up with these results:

Trial  Dead node(s)  MCUCSR
#1     4             0x02
#2     5             0x0B
#3     5             0x0B
#4     1, 2          0x0A (both at the same time, with precision of 1 sec)
#5     (no error until 4 min)
#6     4             0x02
       5             0x0B (both at the same time)
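
For reference, the MCUCSR values above can be decoded into individual reset flags. The sketch below assumes the bit layout given in the ATmega128 datasheet (PORF = bit 0, EXTRF = bit 1, BORF = bit 2, WDRF = bit 3, JTRF = bit 4); decode_mcucsr is a hypothetical helper name, not part of TinyOS, and it is written as plain C so it can be run on a PC rather than on the mote.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a TinyOS or avr-libc function).
 * Writes a space-separated list of the reset flags set in v into buf
 * and returns buf. Bit positions are assumed from the ATmega128
 * datasheet: bit 0 PORF, bit 1 EXTRF, bit 2 BORF, bit 3 WDRF, bit 4 JTRF.
 */
const char *decode_mcucsr(uint8_t v, char *buf, size_t len)
{
    static const char *const names[5] = {
        "PORF",   /* bit 0: power-on reset  */
        "EXTRF",  /* bit 1: external reset  */
        "BORF",   /* bit 2: brown-out reset */
        "WDRF",   /* bit 3: watchdog reset  */
        "JTRF"    /* bit 4: JTAG AVR reset  */
    };
    size_t used = 0;
    int bit;

    if (len == 0) {
        return buf;
    }
    buf[0] = '\0';

    for (bit = 0; bit < 5; bit++) {
        if (v & (1u << bit)) {
            int n = snprintf(buf + used, len - used, "%s%s",
                             used ? " " : "", names[bit]);
            if (n < 0 || (size_t)n >= len - used) {
                break;  /* output truncated: stop appending */
            }
            used += (size_t)n;
        }
    }
    return buf;
}
```

Under that layout, 0x02 is EXTRF alone, 0x0A is EXTRF plus WDRF, and 0x0B is PORF plus EXTRF plus WDRF. As far as I know the flags stay set until a power-on reset or until software writes zero to them, so a value with several bits set may be recording more than one past reset rather than a single event.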

One thought was that these nodes are powered over ethernet, and there  
was recently some discussion on a software upgrade to the router that  
might have changed the maximum draw of the EPRBs, such that if lots  
of the motes are active at once they brownout. But nodes on my desk  
powered with AA batteries failed as well.

These varied causes lead me to believe that the cause is something  
Robert Adler pointed out: if you corrupt the stack, it's possible  
that you'll jump to the reset vector. This could also explain why  
MCUCSR might have these weird values (JTAG reset?!?!?). MCUCSR is  
at address $34 ($54), and immediately follows TCCR0. The atm128 data  
sheet doesn't say whether you can write to MCUCSR: I assume that you  
can.
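
One way to tell a jump to the reset vector apart from a real hardware reset is to clear the reset flags at every boot: a later boot that finds no flag set was presumably not caused by any of the five hardware sources. The following is only a sketch of that idea, assuming the flag bits in MCUCSR really can be cleared by writing zero to them. The names classify_boot, boot_cause_t and RESET_FLAGS_MASK are hypothetical, and the register is passed in as a pointer so the logic can be tried on a PC with an ordinary variable standing in for MCUCSR.

```c
#include <stdint.h>

/* Assumed: the five reset flags occupy bits 0..4 of MCUCSR. */
#define RESET_FLAGS_MASK 0x1Fu

/* Hypothetical classification of how the boot code was reached. */
typedef enum {
    BOOT_HARDWARE_RESET,  /* at least one reset flag was set          */
    BOOT_SOFTWARE_JUMP    /* no flag set: code jumped to the vector   */
} boot_cause_t;

/*
 * Hypothetical boot-time check. Reads the reset flags through the
 * given pointer, clears them so the next boot starts from a clean
 * slate, and reports whether any hardware reset source was recorded.
 * If saved_flags is non-NULL, the flags seen are stored there.
 */
boot_cause_t classify_boot(volatile uint8_t *mcucsr, uint8_t *saved_flags)
{
    uint8_t flags = (uint8_t)(*mcucsr & RESET_FLAGS_MASK);

    /* Clear only the flag bits; leave the other bits untouched. */
    *mcucsr = (uint8_t)(*mcucsr & (uint8_t)~RESET_FLAGS_MASK);

    if (saved_flags != 0) {
        *saved_flags = flags;
    }
    return flags ? BOOT_HARDWARE_RESET : BOOT_SOFTWARE_JUMP;
}
```

On the mote this would be called once, early in the boot sequence, with the address of the real register. It does not rule out a stray write landing on MCUCSR itself, so a nonzero value is still only a hint.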

Time to break out the JTAG...

Phil


