Dillobits Software

Smart software for smart people

 

The 64-bit advantage

When AMD's design team created the x86-64 instruction set architecture (ISA), they tackled several inherent deficiencies of the old x86 ISA. First and foremost among those was a very basic limitation of accessing memory with 32-bit addresses: a 32-bit number can address at most 4GB of memory at one time. That may sound like a lot of memory for the average desktop PC, but not every PC is average, and the x86 ISA is increasingly becoming the platform of choice for technical workstations and servers as well. As memory densities increase over time thanks to Moore's Law, that 4GB limit is beginning to look smaller and smaller. In fact, systems built on Intel's new Core i7 architecture, with its triple-channel memory controller, are typically provisioned with 2GB modules, one per channel, for a total of 6GB, which requires a 64-bit OS to exploit fully. By moving to a 64-bit addressing scheme on newer AMD and Intel processors, the possible address space grows from 2^32 bytes (4GB) to 2^64 bytes (16 exabytes), a factor of roughly four billion, so that the x86-64 ISA allows for what seems like a practically unlimited amount of memory.
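The arithmetic behind those limits can be checked with a short Python sketch. The function names are hypothetical and chosen only for illustration, and the model is deliberately simplified: it treats the address width as the only constraint, ignoring extensions such as PAE and the fact that shipping processors implement fewer than 64 address bits.

```python
# Simplified model: address width is the only limit on usable memory.
# Sizes use binary units, so 1 GB here means 2**30 bytes.
GIB = 2 ** 30


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses an address of this width can name."""
    return 2 ** address_bits


def fits_in_address_space(installed_gib: int, address_bits: int) -> bool:
    """True if every installed byte can be given its own address."""
    return installed_gib * GIB <= addressable_bytes(address_bits)


if __name__ == "__main__":
    print(addressable_bytes(32) // GIB)   # 4 (GB)
    print(addressable_bytes(64) // GIB)   # 17179869184 (GB), i.e. 16 exabytes
    print(fits_in_address_space(6, 32))   # False: 6GB exceeds a 32-bit space
    print(fits_in_address_space(6, 64))   # True
```

Under these assumptions, the 6GB configuration mentioned above is the first common desktop setup that simply does not fit into a 32-bit address space.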

Another problem with the x86 ISA is the number of general-purpose registers (GPRs) available. Registers are fast, local slots inside a processor where programs can store values. Data stored in registers is quickly accessible for reuse, and registers are even faster than on-chip cache. The x86 ISA provides only eight general-purpose registers, and thus is generally considered register-poor. Most reasonably contemporary ISAs offer more. The PowerPC 604 RISC architecture, to give one example, has 32 general-purpose registers. Without enough registers for the task at hand, x86 compilers must sometimes direct programs to spend time shuffling data between registers and memory in order to make the right data available for an operation. This creates overhead that slows down computation.

To help alleviate this bottleneck, the x86-64 ISA brings more and better registers to the table. x86-64 adds eight more general-purpose registers, for a total of 16, and they are no longer limited to 32-bit values: all 16 can store 64-bit data types. In addition to the new GPRs, x86-64 also includes eight new 128-bit SSE/SSE2 registers, for a total of 16 of those. These additional registers bring x86 processors up to snuff with the competition, and they will quite likely bring the largest performance gains of any aspect of the move to the x86-64 ISA.

Our applications make extensive use of timestamps, typically represented as 64-bit values, and are well positioned to take direct advantage of the 64-bit arithmetic operations available in the x86-64 ISA. The additional registers also let x86-64-aware compilers generate better-optimized code, and our recompiled applications take full advantage of it.
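To illustrate why native 64-bit arithmetic matters for timestamps, the following Python sketch models adding two 64-bit values in two ways: as a single 64-bit operation, and as the pair of 32-bit additions with a carry that a processor limited to 32-bit registers has to perform. It is an illustration only, not code from our applications; the function names and the sample tick values are hypothetical, and Python integers are masked to imitate fixed-width registers.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add64_native(a: int, b: int) -> int:
    """Model of one 64-bit add: a single operation, wrapping at 2**64."""
    return (a + b) & MASK64


def add64_with_32bit_halves(a: int, b: int) -> int:
    """Model of the same add using only 32-bit registers.

    Each operand is split into low and high 32-bit halves. The low halves
    are added first, and the carry out of that add is fed into the add of
    the high halves.
    """
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    lo_sum = a_lo + b_lo
    lo = lo_sum & MASK32
    carry = lo_sum >> 32           # 1 if the low add overflowed 32 bits

    hi = (a_hi + b_hi + carry) & MASK32
    return (hi << 32) | lo


if __name__ == "__main__":
    start_ticks = 0x00000001FFFFFFFF   # hypothetical 64-bit timestamp
    elapsed = 1                        # hypothetical interval in ticks
    print(hex(add64_native(start_ticks, elapsed)))
    print(hex(add64_with_32bit_halves(start_ticks, elapsed)))
```

Both functions return the same result. The difference is in the work involved: the split version needs two dependent additions and ties up four 32-bit registers for the operands, where the 64-bit version needs one addition and two registers.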