[Tinyos-2.0wg] msp430 usart configure
Vlado Handziski
handzisk at tkn.tu-berlin.de
Sat Aug 5 16:37:18 PDT 2006
On 8/5/06, Philip Levis <pal at cs.stanford.edu> wrote:
>
> On Aug 4, 2006, at 8:35 AM, Jonathan Hui wrote:
>
> >
> > Okay, makes sense. I did measure directly the overhead and it takes
> > less than 10us extra processing time and setModeSpi uses about 90
> > bytes more overhead, so I guess it's not that much overhead.
>
> Actually, let's think about this for a moment. 10us is how many
> cycles? 40 or 80? That's actually not insignificant: the latter is
> about the same as a task queue post/run, so essentially doubling the
> overhead of an ownership transition. While a lot of the operations in
> TinyOS are infrequent enough that we've never really worried about
> fine-turning performance, this is a really low-level mechanism that's
> used all over the system (like tasks). It might therefore be
> worthwhile to try to optimize it, or at least not codify an interface
> which is inherently inefficient. While I might buy 10% overhead for
> the simplicity that consistency brings, 100% overhead seems a bit
> much...
>
>
There are several factors at play here, and some are specific to the nature
of the msp430 usart, so the solution need not be replicated "all over the
system". Plus, the functionality of the old and the new code is not the same,
so this is not a cost for "consistency" only. I don't have the time to do
real measurements to check Jonathan's results, but the old code had an
optimization that skipped even the limited config that was in setModeSPI if
the device was already in SPI mode. I hope that he measured this on the first
execution of the command.
At the moment I am 100% in thesis writing mode, and will remain so for the
foreseeable future, so here is a more in-depth explanation of the decisions
I made, so that you guys can debate this issue without my full
participation.
We have 3 HALs (UART/SPI/I2C) that use a single HPL (USART). For the users
of the HALs we want to create the impression that the underlying device is
just a normal UART/SPI/I2C. This goes for the data path as well as for the
configuration path.
According to the HAA, the HPL has to hide the existence of HW registers from
the HALs. To do this, the HPL has to provide "generic configuration
commands" for get/set of the control registers (the whole register), plus
"specialised configuration commands" for controlling single bits or groups
of bits in the registers with descriptive names like setClockSource,
setBaudRate, etc.
According to the manual, the USART on the msp430 can be configured only while
it is in a reset state. That means that all the config actions in the HPL have
to be performed in one go, using either the "generic configuration commands
(GC)" for writing whole registers or a sequence of "specialised configuration
commands (SC)".
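The "one go" constraint can be sketched in plain C as follows. This is only an illustration under stated assumptions: the register struct is a simulated stand-in for the memory-mapped registers, and usart_regs_t, usart_cfg_t, usart_set_baud and usart_configure are hypothetical names, not the actual HPL commands. SWRST is the software reset bit as named in the manual; its bit position here is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Software reset bit as named in the msp430 manual; position assumed. */
#define SWRST 0x01u

/* Hypothetical stand-in for the memory-mapped USART registers. */
typedef struct {
    uint8_t ctl;   /* control register, holds SWRST */
    uint8_t tctl;  /* transmit control */
    uint8_t br0;   /* baud divider, low byte */
    uint8_t br1;   /* baud divider, high byte */
} usart_regs_t;

/* Hypothetical complete configuration, applied in one go. */
typedef struct {
    uint8_t  ctl;
    uint8_t  tctl;
    uint16_t baud_div;
} usart_cfg_t;

/* SC-style command: only legal while the module is held in reset. */
void usart_set_baud(usart_regs_t *r, uint16_t div)
{
    assert(r->ctl & SWRST);  /* catches use from a non-reset state */
    r->br0 = (uint8_t)(div & 0xFFu);
    r->br1 = (uint8_t)(div >> 8);
}

/* Enter reset, write everything, then release reset. */
void usart_configure(usart_regs_t *r, const usart_cfg_t *c)
{
    r->ctl = SWRST;                       /* hold the module in reset */
    r->ctl = (uint8_t)(c->ctl | SWRST);   /* GC-style whole-register write */
    r->tctl = c->tctl;
    usart_set_baud(r, c->baud_div);       /* SC-style write, still in reset */
    r->ctl = (uint8_t)(r->ctl & ~SWRST);  /* release reset */
}
```

The assert in usart_set_baud shows the kind of misuse described for the old SCs: calling it on a module that is not in reset trips the check.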
At the moment when I made those changes, there were only two SCs, for setting
the clock and the baud rate (used incorrectly, from a non-reset state), and no
GCs for writing the whole registers. The clients were not able to modify the
basic configuration in setModeXXX. I wanted to add this capability but did
not have the time to write all the SCs needed for full configuration of the
USART. So I took the easier way of writing the GCs, with which one can control
every aspect of the USART but cannot cherry-pick individual bits or sets of
bits as would have been possible with the SCs.
Jonathan asked why the parameters and the return values of the GCs are
structs with bit-fields and not simple uint8_t. He is right that bit-fields
have a bad reputation because of their non-portability, and that they are less
efficient than macros because one cannot work directly on the register
memory location and because one cannot optimize situations like this:
struct_reg.a=1;
struct_reg.b=1;
versus:
byte_reg |= (MASK_a | MASK_b);
However, an HPL is hardware-specific by nature, so the non-portability is
not a problem. The mask optimization problem is also not so critical here,
since the HAL needs to do a generic SPI/UART/I2C configuration and does not
know upfront which bits are going to be set. It is thus unable to do the same
trick as above (i.e. the result would be a sequence of SCs that the compiler
cannot combine into a single assignment to the register, resulting in the
same overhead as for the bit-fields). So the only remaining overhead is the
extra fetch/modify/write vs. the in-place modification of the memory
location of the register. I cannot say how large this overhead is before I
check it on a scope. I want to point out that the previous discussion is
relevant only when the GCs are called from the HAL. At the HPL level there
is no restriction on directly accessing the registers via their memory
locations. But we cannot export simple bytes to the HAL, because then the
HAL would need to use masks to compose/extract information from the
parameters/results.
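To make the fetch/modify/write vs. in-place difference concrete, here is a minimal C sketch. It assumes a simulated one-byte register (ctl_reg) with two named bits a and b. The struct layout and the function names hpl_get_ctl, hpl_set_ctl, hal_set_a_and_b and hpl_set_a_and_b_in_place are hypothetical. The HPL packs and unpacks the struct explicitly here only so that the sketch does not depend on the compiler's bit-field layout; the real code may do this differently.

```c
#include <stdint.h>

/* Simulated 8-bit control register; on hardware this is a fixed address. */
uint8_t ctl_reg;

#define MASK_a 0x01u
#define MASK_b 0x02u

/* Register image handed to the HAL: named fields, no masks needed. */
typedef struct {
    unsigned int a    : 1;
    unsigned int b    : 1;
    unsigned int rest : 6;  /* remaining bits, carried through unchanged */
} ctl_t;

/* GC-style getter: the HPL unpacks the register into the struct. */
ctl_t hpl_get_ctl(void)
{
    ctl_t f;
    f.a    = (ctl_reg & MASK_a) ? 1u : 0u;
    f.b    = (ctl_reg & MASK_b) ? 1u : 0u;
    f.rest = (unsigned int)(ctl_reg >> 2);
    return f;
}

/* GC-style setter: the HPL packs the struct back into the register. */
void hpl_set_ctl(ctl_t f)
{
    ctl_reg = (uint8_t)((f.rest << 2) | (f.b << 1) | f.a);
}

/* HAL side: fetch the whole register, modify fields, write it back. */
void hal_set_a_and_b(void)
{
    ctl_t f = hpl_get_ctl();
    f.a = 1;
    f.b = 1;
    hpl_set_ctl(f);
}

/* HPL side: a single in-place read-modify-write with combined masks. */
void hpl_set_a_and_b_in_place(void)
{
    ctl_reg |= (MASK_a | MASK_b);
}
```

Both paths leave the register in the same state; the difference is only in how many loads, stores and shifts it takes to get there, which is the overhead that would need to be measured.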
The next question that was asked is why these GCs are not called directly by
the HALs, i.e. why we go through an intermediate step of
marshaling/demarshaling the configuration information in
setModeXXX/configXXX. I did this in order to completely hide the shared
nature of the module from the HALs. In the absence of SCs, such
marshaling/demarshaling has to be done either in the HPL, as it is done
now, or in the ResourceConfigure.configure() of the HALs. The reason is that
the configuration information supplied by the HAL clients (CC2420 radio,
TDA5250, etc.) is always going to be bus-specific, regardless of whether it
comes in the form of a struct, as it does now (i.e. the same way as for the
ADCs), or through a bus-specific configuration interface with many SCs, this
time at the level of the HAL.
So there are two general options. The current one:
1:
HAL pull config struct from client -> HAL push config struct to HPL ->
demarshal config struct -> marshal register structs -> apply values using
HPL GCs
and the slightly modified one:
1a:
HAL pull config struct from client -> HAL demarshal config struct -> HAL
marshal register structs -> apply values using HPL GCs
OR:
2:
HAL pull config from client using HAL SCs -> HAL push config to HPL using
HPL SCs -> SCs directly modify selected bits in the memory location of the
register
Of course, the two can be combined, so that structs can be used in the
client->HAL part and SCs can be used in the HAL->HPL part or vice versa.
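As a rough illustration of option 1, the chain from client to registers could look like the C sketch below. All of it is assumed for the sake of the example: spi_config_t and its fields, the register image structs, the simulated usart state, and the names hpl_set_mode_spi, hal_spi_configure and radio_get_config are hypothetical and do not reproduce the actual TinyOS interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bus-specific config as a HAL client might supply it (hypothetical). */
typedef struct {
    uint8_t  master;       /* 1 = SPI master */
    uint8_t  clock_phase;  /* 0 or 1 */
    uint16_t clock_div;    /* bit-clock divider */
} spi_config_t;

/* Register images used by the HPL GCs (hypothetical layout). */
typedef struct { unsigned int sync : 1; unsigned int master : 1; } ctl_t;
typedef struct { unsigned int ckph : 1; } tctl_t;

/* Simulated hardware state written by the GCs. */
struct { ctl_t ctl; tctl_t tctl; uint16_t div; } usart;

/* HPL GCs: whole-register writes. */
void hpl_set_ctl(ctl_t v)    { usart.ctl = v; }
void hpl_set_tctl(tctl_t v)  { usart.tctl = v; }
void hpl_set_div(uint16_t v) { usart.div = v; }

/* Option 1: the HPL demarshals the config struct, marshals the
 * register structs, and applies them through the GCs. */
void hpl_set_mode_spi(const spi_config_t *c)
{
    ctl_t ctl = {0};
    tctl_t tctl = {0};
    assert(c != NULL);
    ctl.sync = 1;                          /* select synchronous mode */
    ctl.master = c->master ? 1u : 0u;
    tctl.ckph = c->clock_phase ? 1u : 0u;
    hpl_set_ctl(ctl);
    hpl_set_tctl(tctl);
    hpl_set_div(c->clock_div);
}

/* The HAL only pulls the struct from its client and pushes it down. */
typedef const spi_config_t *(*get_config_fn)(void);

void hal_spi_configure(get_config_fn get_config)
{
    hpl_set_mode_spi(get_config());
}

/* Example client, standing in for e.g. a radio driver. */
static const spi_config_t radio_cfg = {1, 0, 2};
const spi_config_t *radio_get_config(void) { return &radio_cfg; }
```

In option 1a the body of hpl_set_mode_spi would move up into hal_spi_configure, so the HAL would see the register structs; in option 2 both structs would disappear in favour of sequences of SC calls.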
It is hard for me to say which solution has the best cost/benefit ratio
because there is no alternative implementation that has the same
functionality as 1. The overhead of 1 can be somewhat reduced by removing
some unnecessary reads if we don't care about disturbing the bits in the
registers that belong to the other modes (since they will be re-configured
anyhow). Unfortunately, I don't have the time to do any of these
modifications/comparisons. If someone wants to take on the task I will try
to help as much as I can, but I can not make any promises.
Vlado