Machine-Learning System Based on Light Could Yield More Powerful, Efficient Large Language Models


MIT system demonstrates greater than 100-fold improvement in energy efficiency and a 25-fold improvement in compute density compared with current systems.

Copyright: news.mit.edu – “Machine-Learning System Based on Light Could Yield More Powerful, Efficient Large Language Models”


ChatGPT has made headlines around the world with its ability to write essays, email, and computer code based on a few prompts from a user. Now an MIT-led team reports a system that could lead to machine-learning programs several orders of magnitude more powerful than the one behind ChatGPT. The system they developed could also use several orders of magnitude less energy than the state-of-the-art supercomputers behind the machine-learning models of today.

In the July 17 issue of Nature Photonics, the researchers report the first experimental demonstration of the new system, which performs its computations based on the movement of light, rather than electrons, using hundreds of micron-scale lasers. With the new system, the team reports a greater than 100-fold improvement in energy efficiency and a 25-fold improvement in compute density, a measure of the power of a system, over state-of-the-art digital computers for machine learning.

Toward the future

In the paper, the team also cites “substantially several more orders of magnitude for future improvement.” As a result, the authors continue, the technique “opens an avenue to large-scale optoelectronic processors to accelerate machine-learning tasks from data centers to decentralized edge devices.” In other words, cellphones and other small devices could become capable of running programs that can currently only be computed at large data centers.

Further, because the components of the system can be created using fabrication processes already in use today, “we expect that it could be scaled for commercial use in a few years. For example, the laser arrays involved are widely used in cell-phone face ID and data communication,” says Zaijun Chen, first author, who conducted the work while a postdoc at MIT in the Research Laboratory of Electronics (RLE) and is now an assistant professor at the University of Southern California.[…]

SwissCognitive, World-Leading AI Network.