Old 06-01-2012, 04:56 PM   #22
liteon
Human being with feelings
 
Join Date: Apr 2008
Posts: 510

Quote:
Originally Posted by Justin
I'd imagine that Thumb mode shouldn't even be considered, since RAM use isn't a concern. Also I'd be curious whether loading constants via PC-relative addressing and the associated branch is worthwhile; probably it would make more sense to either a) encode as 4 instructions (ugh), or B) make each codehandle have a table of pointers to load from (provided the count is small enough to be addressable). The latter is something I've considered doing for PPC, too, but it doesn't quite seem worth it as PPC can do constant 32 bit loads in 2 instructions...
yep, no thumb mode. the current scheme would not work well with it anyway, since the port depends on 4-byte offsets (and uses r8). the mode switching itself is a bit confusing, and the cpu model naming scheme that arm uses doesn't help.

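as far as i know, the pc-relative load is the safest way, and the only single-instruction way, to load an arbitrary 32-bit value.
there is also mvn (move + not), which covers some constants. for example:
ldr r0, =0xffffff00
could be:
mvn r0, #255
but this will not work for 0xfffffe00, since its complement (0x1ff) does not fit in the 8-bit immediate field.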
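
to make the mov/mvn limitation above concrete, here is a minimal sketch in c. it assumes the classic arm-mode immediate encoding (an 8-bit value rotated right by an even amount); the names arm_imm_encodable and pick_load are hypothetical and not part of the port:

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate left by n bits (n in 0..31). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* True if v is an 8-bit value rotated right by an even amount,
   i.e. it fits an ARM-mode data-processing immediate. */
static bool arm_imm_encodable(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate-right by rot and check that only the low 8 bits remain. */
        if ((rotl32(v, rot) & ~(uint32_t)0xFF) == 0)
            return true;
    }
    return false;
}

/* Hypothetical selector: which kind of load a code generator could emit. */
typedef enum { LOAD_MOV, LOAD_MVN, LOAD_LITERAL } load_kind;

static load_kind pick_load(uint32_t v)
{
    if (arm_imm_encodable(v))
        return LOAD_MOV;      /* mov rd, #v */
    if (arm_imm_encodable(~v))
        return LOAD_MVN;      /* mvn rd, #~v */
    return LOAD_LITERAL;      /* ldr rd, =v (pc-relative literal) */
}
```

with this check, 0xffffff00 comes out as an mvn candidate, while 0xfffffe00 falls through to the pc-relative literal load.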

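gcc seems to use pc-relative loads quite a lot, even for smaller values. this is a dump of the literal pool at the end of <main> (the disassembler decodes the pool words as if they were instructions; the second column is the actual data):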
Code:
   188f8:	0002a87c 	andeq	sl, r2, ip, ror r8
   188fc:	0002a884 	andeq	sl, r2, r4, lsl #17
   18900:	0002a89c 	muleq	r2, ip, r8
   18904:	0002a8a4 	andeq	sl, r2, r4, lsr #17
   18908:	0002a8b8 	streqh	sl, [r2], -r8
   1890c:	0002a8bc 	streqh	sl, [r2], -ip
   ...
the second method you propose is something i've considered as well. there is already an address table dumped into a pool in GLUE_CALL_CODE (though i think it really should live in c and be passed as an __asm parameter, like you do with "consttab"). the table itself is passed to the nseel_asm_... methods to provide some function pointers, because i wasn't able to get their correct addresses in any other way. 256 values would hardly be reachable at this point, i think.

with a table, loading a full double would take 2 instructions (or ~4 cycles (edit)) instead of 4.
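
a rough sketch of what such a per-codehandle table could look like on the c side. everything here is hypothetical (const_table, const_table_add, the size limit); the limit assumes a plain arm-mode ldr with a 12-bit byte offset from some base register, so at most 1024 words are reachable with one instruction:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 1024 words * 4 bytes = 4096 bytes, the span of a 12-bit ldr offset. */
#define CONST_TABLE_MAX 1024

typedef struct {
    uint32_t words[CONST_TABLE_MAX];
    size_t   count;
} const_table;

/* Adds v to the table, reusing an existing slot when the value is already there.
   Returns the byte offset from the table base, or -1 if the table is full. */
static int const_table_add(const_table *t, uint32_t v)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->words[i] == v)
            return (int)(i * 4);
    }
    if (t->count == CONST_TABLE_MAX)
        return -1;
    t->words[t->count] = v;
    return (int)(t->count++ * 4);
}
```

the returned offset is what the code generator would patch into a load of the form ldr rd, [base, #offset]; a double would take two adjacent slots and two such loads.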

--

Last edited by liteon; 06-03-2012 at 06:29 AM.