Full Abstract (a short version was submitted with the application)

Dear NeurIPS EXPO Talk Review Team,

As a Platinum-level sponsor, we are excited to apply for an EXPO Talk. A detailed outline, including references, can be accessed via the provided URL, which leads to a published Notion page.

In this talk, we will discuss how innovations in number systems, such as logarithmic math, and their co-designed hardware can accelerate and significantly influence the adoption of AI.

Along with a comprehensive theoretical overview, we will explore the practicality of implementing these innovations in hardware and provide quantitative examples at both the single-operation level and the broader AI system and model level.

The talk will be divided into four sections: (1) the impact of inference system design on AI adoption, (2) a review of recent trends in low-precision data types, (3) our research on logarithmic math as a supplement or alternative to low-precision data types, and (4) lessons learned from co-designing logarithmic math and AI hardware.

AI adoption in enterprises is primarily driven by three factors: (A) Trust, particularly in the quality of AI system outputs, (B) Cost, both for system implementation and ongoing operation, and (C) User Experience (UX), including ease of deployment.

We will explain why low-precision data types are widely regarded as the most effective lever for reducing the Cost of AI compute, and review the data types commonly used in today's AI systems, particularly in large neural networks for language, vision, and multimodal processing.
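To ground this review, the following minimal Python sketch shows symmetric per-tensor int8 quantization, one of the most common low-precision schemes (a simplified illustration for exposition, not the exact scheme used in any particular system):

```python
import numpy as np

def quantize_int8(x: np.ndarray) -> tuple[np.ndarray, float]:
    # Symmetric per-tensor quantization: a single scale maps the
    # float range onto the signed 8-bit integer grid [-127, 127].
    scale = float(np.abs(x).max()) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Map the integer codes back to approximate float values.
    return q.astype(np.float32) * scale

x = np.random.randn(8).astype(np.float32)
q, scale = quantize_int8(x)
print(np.abs(dequantize(q, scale) - x).max())  # rounding error <= scale / 2
```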

However, low-precision data types can also affect the output quality of neural networks in non-trivial ways, for example through degraded handling of activation outliers, posing challenges for Trust and UX in AI adoption. We will provide concrete examples that illustrate the often hard-to-measure trade-offs between Cost, Trust, and UX, explain why these trade-offs are challenging for enterprises, and then present a potential solution in the subsequent sections.
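The toy example below (again a deliberate simplification) shows one such effect: with per-tensor int8 quantization, a single activation outlier inflates the scale, so all the small values collapse onto a handful of integer levels and lose precision.

```python
import numpy as np

def int8_roundtrip(x: np.ndarray) -> np.ndarray:
    # Same symmetric per-tensor scheme as above: quantize, then dequantize.
    scale = float(np.abs(x).max()) / 127.0
    return np.clip(np.round(x / scale), -127, 127) * scale

rng = np.random.default_rng(0)
x = (rng.standard_normal(1024) * 0.01).astype(np.float32)
err_clean = np.abs(int8_roundtrip(x) - x).mean()

x_outlier = x.copy()
x_outlier[0] = 50.0  # a single large activation outlier
err_outlier = np.abs(int8_roundtrip(x_outlier) - x_outlier)[1:].mean()

print(err_clean, err_outlier)  # the mean error grows by orders of magnitude
```

Mitigations such as per-channel scales or mixed precision exist, but each adds cost or complexity, which is exactly the trade-off space the talk will map out.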

Recently, the logarithmic number system has gained attention in both academia and industry for its ability to replace multiplications with additions, reducing chip area and power consumption by factors of 3 to 4x at the level of a single operation, with further secondary gains at the system level.

However, the mapping from logarithmic to linear space required within multiply-accumulate operations poses a significant challenge for hardware implementation.
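To illustrate the issue, the following Python sketch (positive values only; a real logarithmic number system carries a separate sign bit) contrasts the cheap log-domain multiply with the costly log-to-linear mapping needed before accumulation:

```python
import math

# In a logarithmic number system (LNS) a value v is stored as log2(v),
# so a multiply becomes a plain addition of the stored codes. The
# accumulate step, however, has no cheap log-domain equivalent: each
# partial product must be mapped back to linear space before summing.
weights = [0.5, 1.25, 2.0]
activations = [3.0, 0.75, 1.5]

acc = 0.0
for w, x in zip(weights, activations):
    log_prod = math.log2(w) + math.log2(x)  # cheap: just an adder in hardware
    acc += 2.0 ** log_prod                  # costly: log-to-linear conversion
print(acc)  # matches sum(w * x for w, x in zip(weights, activations))
```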

We will compare different approaches, such as lookup tables (LUTs), Taylor series expansions, and the Mitchell approximation, log2(1+x) ≈ x, in terms of accuracy, technical feasibility in silicon, and power efficiency.
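As a reference point, here is a toy Python implementation of a Mitchell-style approximate multiplier; the 1.5 * 1.5 case hits the approximation's worst-case relative error of about -11%:

```python
import math

def mitchell_log2(v: float) -> float:
    # For v = 2**e * (1 + m) with m in [0, 1), Mitchell approximates
    # log2(v) = e + log2(1 + m) as simply e + m.
    e = math.floor(math.log2(v))
    m = v / 2.0**e - 1.0
    return e + m

def mitchell_exp2(l: float) -> float:
    # Inverse mapping: for l = e + f with f in [0, 1),
    # 2**l = 2**e * 2**f is approximated as 2**e * (1 + f).
    e = math.floor(l)
    return 2.0**e * (1.0 + (l - e))

def mitchell_mul(a: float, b: float) -> float:
    # Multiplication becomes an addition in the approximate log domain.
    return mitchell_exp2(mitchell_log2(a) + mitchell_log2(b))

for a, b in [(3.0, 5.0), (1.5, 1.5), (7.0, 9.0)]:
    approx = mitchell_mul(a, b)
    print(f"{a} * {b} = {a * b:.3f}, Mitchell ~= {approx:.3f} "
          f"({100 * (approx - a * b) / (a * b):+.1f}% error)")
```

In hardware this mapping is nearly free: the e + m code corresponds closely to the raw bit pattern of a floating-point-style encoding, which is why Mitchell's approximation is so cheap to implement in silicon.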

In addition to covering the core mathematical principles of logarithmic math, we will present an elegant improvement to the Mitchell approximation. In its original form, the approximation is infeasible for large genAI models because it requires costly quantization-aware training, as did our first-generation silicon implementation. We will explain how this improvement arguably renders logarithmic math a Pareto-optimal number system in terms of power vs. precision, making it ideal for running large multimodal models that require both high-dynamic-range activations and accurate regressions.

Building on that, we will demonstrate how logarithmic math improves Trust and UX compared to the traditional linear math currently used in the industry, and present quantitative results at the network level, for example on large language models and diffusion models: sub-0.1% accuracy loss (against baseline IEEE 32/16-bit models) across a range of models, while keeping Cost low, with system-level power consumption comparable to running the entire model in 4-bit precision on floating-point hardware.

In the final part of our talk, we will explain how the chip area and power savings from logarithmic math bring secondary benefits in silicon design, such as greater flexibility in the design of on- and off-chip data paths and a more balanced ratio of general-purpose to specialized compute, leading to higher overall utilization and further lowering the cost of AI inference.

We will also share our experience in co-designing algorithms and hardware systems, emphasizing the importance of closing the loop between these two disciplines.

With seven years of foundational innovations in logarithmic math and proven hardware systems based on these developments, we consider ourselves pioneers in this field and are excited to share our learnings and solutions.