
Why is this paper the real revolution in AI?

  • Writer: Juan Carlos Galindo
  • Jul 23, 2024
  • 1 min read




Energy and computational resources pose the true challenge for the scalability of AI. In this paper, Microsoft Research presents a new approach to quantized models, BitNet b1.58.

This approach reduces the resources consumed when processing large amounts of data. The current trend is to grow the amount of data and the size of the model, and with it the computational cost, to produce more powerful AI neural networks. However, that trend translates into millions of dollars in hardware and energy.

Imagine having a powerful model like ChatGPT running locally in your pocket. That is a possibility with this approach.

The paper proposes that every single parameter (or weight) of the LLM is ternary, taking values in {-1, 0, 1}. It matches the performance of full-precision (FP16 or BF16) Transformer LLMs of the same model size.
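The ternary weights come from the paper's absmean quantization: scale the weight matrix by its mean absolute value, then round and clip every entry to {-1, 0, 1}. A minimal NumPy sketch of that idea (the function name is illustrative, not from the paper's code):

```python
import numpy as np

def absmean_ternary(W, eps=1e-8):
    # Absmean quantization: scale by the mean absolute weight,
    # then round and clip each entry to the ternary set {-1, 0, 1}.
    gamma = np.mean(np.abs(W)) + eps
    Wq = np.clip(np.round(W / gamma), -1, 1).astype(np.int8)
    return Wq, gamma

W = np.array([[0.9, -0.05, -1.2],
              [0.1,  0.8,  -0.7]])
Wq, gamma = absmean_ternary(W)
# Every entry of Wq is now -1, 0, or 1; gamma is kept to rescale outputs.
```

The scale factor `gamma` is retained so activations can be rescaled after the ternary matrix product.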


This would represent a new paradigm that allows:



  • Training models from scratch as 1-bit models, rather than applying post-training quantization.

  • Achieving 4 times faster processing, 7 times less memory usage, and, most importantly, 40 times more energy efficiency.


What previously required many matrix multiplications and additions can now be reduced to just additions and subtractions of the model's input activations, since multiplying by -1, 0, or 1 requires no actual multiplication.
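With every weight in {-1, 0, 1}, a matrix-vector product collapses to sums of added and subtracted inputs. A minimal sketch of that arithmetic (names are illustrative; real kernels pack the ternary weights far more efficiently):

```python
import numpy as np

def ternary_matvec(Wq, x):
    # Wq has entries in {-1, 0, 1}: each output element is just the sum of
    # the inputs where the weight is +1, minus those where it is -1.
    out = np.zeros(Wq.shape[0])
    for i in range(Wq.shape[0]):
        out[i] = x[Wq[i] == 1].sum() - x[Wq[i] == -1].sum()
    return out

Wq = np.array([[1, 0, -1],
               [0, 1, -1]], dtype=np.int8)
x = np.array([2.0, 3.0, 4.0])
y = ternary_matvec(Wq, x)  # same result as Wq @ x, with no multiplications
```

This is where the energy savings come from: additions cost far less than floating-point multiplications in hardware.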

Finally, this challenges Nvidia, which has bet on quantization through lower-precision floating-point formats such as FP8 and FP4.


Original paper from Microsoft Research:


