a glob of nerdishness

October 20, 2007

Mach-O binaries as seen by otool

written by natevw @ 6:58 am

On OS X, you use otool (specifically otool -L) instead of ldd to view the shared libraries an executable requires. An interesting browse, proofread by Slashdot last year, and providing further reading as well. (See also the Firefox Poster from Source Code on that blog.)

April 2, 2007

Assembly primer

written by natevw @ 8:09 pm

I’ve been doing a lot of C++ coding at work lately, and sometimes the only “source” Xcode can show when debugging a compiled library is assembly code. Could knowing assembly language help debug C++ code? Short answer: no. There are better techniques, and the assembly code often takes me closer to the machine (and the deadline) than I need to be in those instances.

Yet I’ve had a longtime interest in assembly code, for numerous reasons(1). Take this tiny piece of segmentation-faulting assembly:

call *8(%edx)

I had a hunch edx was a processor register (a “variable” of sorts), and sure enough Xcode’s debugger showed a 0 in the EDX register. “Call” then seems to say I’m trying to execute code from a bad function pointer that got written into EDX. But what’s with the “*8”, and does the ‘%’ mean anything special? Enter a great tutorial on AT&T Assembly Syntax, which happens to be what the GNU toolset, and therefore Xcode, uses(2). From that, we see that the percent sign is just a sigil that prefixes a register name.
The “*8(” part is a bit trickier. Under “Memory Addressing”, we see that a memory operand takes the form of “segment-override:signed-offset(base,index,scale)”. Don’t ask me what all that is, but it seems in our case we can simplify that down to “signed-offset(base)”. Lower down, we see that “Branch addressing using registers or memory operands must be prefixed by a ‘*’”. So it appears that this instruction says: read the address stored 8 bytes past wherever the EDX register points, and call the code at that address. Cool!(3)
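For anyone who thinks better in C than in AT&T syntax, here is a rough sketch of how I read that instruction. The effective_address helper spells out the base + index*scale + offset arithmetic, and call_through does what I understand “call *offset(%reg)” to do: fetch a function pointer from memory at that address, then call it. All of the names here (handler_a, call_through, call_third_slot and so on) are made up for illustration, and the example uses “two pointers in” rather than a literal 8 so that it works on both 32-bit and 64-bit machines.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function type: the kind of thing the call ends up invoking. */
typedef int (*handler_fn)(void);

static int handler_a(void) { return 1; }
static int handler_b(void) { return 2; }
static int handler_c(void) { return 3; }

/* AT&T operand "disp(base,index,scale)" names the address
 * base + index * scale + disp. With no index register, index is 0. */
size_t effective_address(size_t base, size_t index, size_t scale, ptrdiff_t disp)
{
    return base + index * scale + (size_t)disp;
}

/* Rough C reading of "call *disp(%reg)": load the pointer stored
 * disp bytes past base, then jump to whatever it points at. */
int call_through(const void *base, size_t disp)
{
    handler_fn target;
    memcpy(&target, (const unsigned char *)base + disp, sizeof target);
    return target();
}

/* On 32-bit x86 a pointer is 4 bytes, so offset 8 is the third slot
 * of a table of function pointers. This picks the third slot portably. */
int call_third_slot(void)
{
    handler_fn table[3] = { handler_a, handler_b, handler_c };
    return call_through(table, 2 * sizeof(handler_fn));
}
```

If base is 0, as EDX was in my crash, the memcpy in call_through is the step that would blow up: the program never even gets as far as the call.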

It’s still on my to-read list, but if assembly language is interesting, you might want to check out Paul A. Carter’s PC Assembly Language free PDF book, geared towards assembly language from C. Let me know how it goes!

  1. …from the days when the processor was closer, and just wanting to know how it worked, as well as wanting to write über-optimized code, modify my GPS’s firmware and make things with embedded processors…with only an assembler and raw coder-manliness! Now that Apple uses x86 on their desktop machines, that’s all the more reason to learn.
  2. To fully decode Assembly Language, you’ll also need a mnemonic reference for your architecture. The sig9 tutorial uses IA-32, which has a good chance of being what you’re using.
  3. Or in our case, not so cool. Since the EDX register contains 0, this would try to fetch the function’s address from memory location 0x8, which isn’t ours to read. Thankfully, the kernel detects an address this messed up and puts a quick stop to the program. However, most of those worrisome “arbitrary code execution” security holes, which Microsoft was particularly good at, work using similar unexpected-address calls: a cracker finds a way to a) put some machine code into memory, and b) put a “call” into the list of upcoming instructions that will run said machine code.

February 3, 2007

Fat vs. Universal

written by natevw @ 7:52 pm

Sounds like the MPAA looking to devour another tasty morsel, compliments of our fine legislative system. But today’s topic is Virtual Machines, not virtual monopolies. By “Virtual Machine”, I don’t mean “Virtual Computer”, I mean “Virtual Processor”. There’s much to be said about the uses and shortcomings of virtual machines. I’d like to focus specifically on how the technology could benefit, or could have benefited, Apple (formerly Computer).

Apple has the privilege of designing it all: hardware, operating system and a leading share of the applications. Contrast “design” with “dictate” — it wouldn’t be practical for them to design all the chips they need, nor are they able to write software for every niche that has a need. Even their operating system, whether due to beneficence or lack of market domination, follows an admirable number of external standards. One of these is LLVM, a Virtual Machine standard already integrated into part of Leopard’s graphics system.

The two outside ends, hardware and applications, are the areas where Apple must take extra care in its design to give some deference to the plans of its suppliers and developers. And when it comes to processor choice, both come into play. Apple has already led a fairly smooth transition from PowerPC-compiled applications to so-called “Universal Binaries”. These programs are really just fat binaries, a technology which has been in use on the Mac for quite a while now.
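To make “fat” a bit more concrete, here is a small C sketch that peeks at the header of a fat file held in a byte buffer. It assumes the layout as I understand it from Apple’s mach-o/fat.h: a big-endian magic number 0xCAFEBABE, a big-endian count of architectures, then one 20-byte record per architecture beginning with its CPU type (7 for x86, 18 for PowerPC). The function names and constant names are my own inventions for illustration, not Apple’s API.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, per my reading of <mach-o/fat.h>; names are hypothetical. */
#define FAT_MAGIC_VALUE   0xCAFEBABEu
#define FAT_HEADER_BYTES  8u    /* magic + architecture count */
#define FAT_ARCH_BYTES    20u   /* cputype, cpusubtype, offset, size, align */

enum { CPU_X86 = 7, CPU_POWERPC = 18 };

/* Fat headers are stored big-endian regardless of the host CPU. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Number of architecture slices, or -1 if buf is not a plausible fat header. */
int64_t fat_arch_count(const unsigned char *buf, size_t len)
{
    if (len < FAT_HEADER_BYTES || read_be32(buf) != FAT_MAGIC_VALUE)
        return -1;
    uint32_t count = read_be32(buf + 4);
    if ((len - FAT_HEADER_BYTES) / FAT_ARCH_BYTES < count)
        return -1;  /* buffer too short to hold that many records */
    return (int64_t)count;
}

/* CPU type of slice i, or -1 if i is out of range or the header is bad. */
int64_t fat_arch_cputype(const unsigned char *buf, size_t len, uint32_t i)
{
    int64_t count = fat_arch_count(buf, len);
    if (count < 0 || (int64_t)i >= count)
        return -1;
    return (int64_t)read_be32(buf + FAT_HEADER_BYTES + (size_t)i * FAT_ARCH_BYTES);
}
```

A typical Universal Binary of this era should report two slices, one with CPU type 18 and one with CPU type 7, which is rather the point of the next paragraph.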

Fat -vs- Universal logos

“Universal” is a misnomer: the text may read one way but, as the dyadic logo implies, the reality is that most Universal Binaries are compiled for only two platforms, PowerPC and Intel. The next time Apple asks its developers to “check the box” and recompile could be the last for a long time if they move to an actual Universal Binary compiled to run on top of a Virtual Machine. Apple’s involvement in LLVM has been known since at least 2005, and I have no doubt they could easily pull it off. It might feel like joining the ranks of Microsoft “Why are they yelling?” .NET and Java with its “write once, ugly anywhere”, but I think it would still warrant applause at a future WWDC.

January 13, 2007

C programmer whoa/duh moment

written by natevw @ 10:03 am

While searching for info on how the “ISA Reference” item came to be in my application services menu, I came across a fantastic post on a blog which just got added to the sidebar: Generating Machine Code at Runtime. Essentially, you store machine code in an array (anyone remember POKE in the good old days?) and call the pointer as a function. I was stunned, and then shamed. What else is a function pointer if not a pointer to machine code in memory?
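Here is my own minimal sketch of the idea, not the code from that post. It assumes an x86 or x86-64 processor and a POSIX system with mmap and mprotect; on anything else it just gives up and returns -1. The six bytes are the encoding of “mov $42, %eax; ret”, and run_generated_code is a name I made up. Modern systems generally refuse to execute a plain array, so the bytes get copied into a page that is then marked executable.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

typedef int (*int_fn)(void);

/* Hypothetical helper: runs a tiny hand-assembled function and returns
 * its result, or -1 on failure or on an unsupported architecture. */
int run_generated_code(void)
{
#if defined(__i386__) || defined(__x86_64__)
    static const unsigned char code[] = {
        0xB8, 0x2A, 0x00, 0x00, 0x00,  /* mov $42, %eax */
        0xC3                           /* ret           */
    };

    /* Get a writable page, "POKE" the bytes in, then flip it to executable. */
    void *mem = mmap(NULL, sizeof code, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    memcpy(mem, code, sizeof code);
    if (mprotect(mem, sizeof code, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, sizeof code);
        return -1;
    }

    /* A function pointer is just the address of some machine code. */
    int_fn fn = (int_fn)mem;
    int result = fn();

    munmap(mem, sizeof code);
    return result;
#else
    return -1;  /* the bytes above only mean something to an x86 CPU */
#endif
}
```

Casting a data pointer to a function pointer is not strictly blessed by the C standard, but POSIX systems allow it, and it is exactly the whoa/duh part: the cast changes nothing except how the compiler lets you use the address.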