A large language model from Europe for the world
While Prof. Dr. Sepp Hochreiter was giving his keynote at NeurIPS, the new 7B model went live. The xLSTM architecture is available in NXAI's GitHub repository, and a pre-trained model is available for fine-tuning on Hugging Face.
16 Dec 2024
“Our scaling predictions have come true. With the xLSTM 7B model, we present the best Large Language Model (LLM) based on recurrent neural networks (RNNs). It is the most energy-efficient large language model in the world, and it offers fast inference,” explains Hochreiter. He teaches at JKU Linz and is Chief Scientist at NXAI.
“We are pleased that many people can now integrate the advantages of our architecture into their products and develop their own applications based on the xLSTM 7B model. AI applications in the edge and embedded space in particular benefit enormously from the high efficiency and speed of our model. Every researcher worldwide can use the xLSTM 7B model for their work. It is a model from Europe for the world,” explains Hochreiter.
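For developers who want to experiment, loading a pre-trained model from Hugging Face typically takes only a few lines with the transformers library. The sketch below is illustrative, not an official quickstart: the repository id NX-AI/xLSTM-7b is an assumption, and the exact id, any extra dependencies, and version requirements should be checked on the model's Hugging Face page.

```python
# Minimal sketch: loading the pre-trained 7B model for inference or fine-tuning.
# Assumption: the model is published under the repository id "NX-AI/xLSTM-7b";
# verify the id and required package versions on the Hugging Face hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NX-AI/xLSTM-7b"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" spreads the weights across available devices
# (requires the accelerate package).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Generate a short continuation as a smoke test.
inputs = tokenizer("Recurrent language models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

From here, the usual fine-tuning workflows of the Hugging Face ecosystem apply, since the model is exposed through the standard causal-LM interface.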
xLSTM is more than just an LLM
Since the xLSTM architecture was first published in the spring of this year, many developers have presented solutions based on the approach. xLSTM is particularly popular in the industrial sector. “I see great potential for xLSTM in robotics because it is significantly faster and more memory-efficient during inference,” explains Hochreiter.
A few days ago, a research paper proposed a Large Recurrent Action Model (LRAM) for robotics based on xLSTM. Industry experts report that the architecture is also being used in mobility applications thanks to its longer, variable memory. The same applies to medical technology and life-science applications. “In addition, xLSTM is already being used for time series forecasting and shows superior performance in long-term forecasts compared to other methods,” reports Hochreiter. From the developers' point of view, xLSTM is more than just an LLM.
The background: in contrast to Transformer technology, where the cost of processing grows quadratically with the length of the text, xLSTM's computations grow only linearly with text length and therefore require less computing power during operation. This is a major advantage, as complex tasks require much more text for both the task description and the solution.
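To make the scaling argument concrete, the toy sketch below contrasts the two regimes: a recurrent model carries a fixed-size state from token to token, doing constant work per step, while standard self-attention compares each token with all previous ones. This is a simplified back-of-the-envelope illustration of the general principle, not NXAI's actual implementation.

```python
# Toy illustration of the scaling difference, not NXAI's actual kernels.
# A recurrent model does constant work per token: total cost grows as O(n).
# Standard self-attention compares each token with all earlier ones: O(n^2).

def recurrent_cost(n_tokens: int, state_dim: int = 1) -> int:
    # One fixed-size state update per token -> linear in sequence length.
    return n_tokens * state_dim

def attention_cost(n_tokens: int) -> int:
    # Token t attends to all t earlier positions -> quadratic in sequence length.
    return sum(t for t in range(1, n_tokens + 1))

for n in (1_000, 10_000, 100_000):
    print(f"n={n:>7}: recurrent ~{recurrent_cost(n):>12,} ops, "
          f"attention ~{attention_cost(n):>15,} ops")
```

The gap widens rapidly: at 100,000 tokens the quadratic term is roughly 50,000 times the linear one, which is why long task descriptions and long solutions favor the recurrent approach.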