Ten billion tokens are available for free, and a Tsinghua-affiliated AI company is seizing the opportunity.
Yunzhong from Aofei Temple
Quantum Bit | Public Account QbitAI
10 billion tokens in subsidies, free starting in April!
This time, the deal comes from Wuwen Xinqiong, a Tsinghua-affiliated AI company, and both businesses and individuals can take advantage of it!
The company was founded in May 2023 with the goal of creating the best computing power solution integrating software and hardware for large models.
Just recently, it released the Infini-AI large-model development and service platform, built on a multi-chip computing power base, which lets developers experience and compare the performance of various models and chips.
After the large-model wave emerged, some people joked:
Rather than "benefiting mankind", large models should first call out "give me application scenarios".
However, Wuwen Xinqiong believes that after the rapid development of the Internet era, the Chinese market is not short of application scenarios. The real difficulty in deploying large models is the computing power problem the industry keeps running into.
Rather than "give me a scenario", the first call should be "solve the computing power problem".
And this is what Wuwen Xinqiong is doing.
Let developers spend less, use good tools, and have ample computing power
Today, Wuwen Xinqiong released the Wuqiong Infini-AI large-model development and service platform, built on a multi-chip computing power base.
The company also announced that full registration will officially open on March 31st, offering a free quota of 10 billion tokens to all real-name registered individuals and businesses.
Developers can experience and compare various model capabilities and chip effects on this platform.
By simply dragging parameter sliders, developers can fine-tune a large model better suited to their business and deploy it on Infini-AI, then serve their own users at a very favorable price per 1,000 tokens.
At present, Wuqiong Infini-AI supports more than 20 models, including Baichuan2, ChatGLM2, ChatGLM3, the closed-source ChatGLM3 model, Llama2, Qwen, and the Qwen1.5 series, as well as more than 10 computing cards from AMD, Biren, Cambricon, Enflame, Iluvatar CoreX, MetaX, Moore Threads, NVIDIA, and others, supporting joint software-hardware optimization and unified deployment across multiple models and multiple chips.
Models trained and fine-tuned on third-party platforms or by custom means can also be seamlessly migrated and hosted on Infini-AI, and receive fine-grained customized token-based billing solutions.
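For developers, usage-based hosting of this kind typically looks like a simple HTTP call plus per-token billing. The sketch below assumes an OpenAI-style chat-completions request format; the endpoint URL, model name, and unit price are hypothetical placeholders, not Infini-AI's published values:

```python
import json

# Hypothetical: Infini-AI's real endpoint path, model identifiers, and
# authentication scheme may differ -- consult the platform's own docs.
API_URL = "https://api.example-infini-ai.com/v1/chat/completions"

def build_chat_payload(model: str, prompt: str, max_tokens: int = 256) -> str:
    """Serialize a single-turn chat request in OpenAI-compatible form."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })

def token_cost(total_tokens: int, price_per_1k_tokens: float) -> float:
    """Bill for one request under per-1,000-token pricing."""
    return total_tokens / 1000.0 * price_per_1k_tokens

# A request consuming 2,500 tokens at a hypothetical 0.01 yuan per 1k
# tokens costs about 0.025 yuan; a free quota such as the 10 billion
# tokens above is drawn down the same way, token by token.
```

The point of the per-1,000-token unit is that cost tracks actual usage, so a migrated or self-trained model pays nothing while idle.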
"Our coverage of model brands and chip brands will continue to increase, and as time goes by, the cost-effectiveness of Infini-AI will become more and more prominent." Xia Lixue, co-founder and CEO of Wuwen Xinqiong, said that in the future, Infini-AI will also support the listing of more models and computing power ecosystem partners' products, allowing more large model developers to "spend little money and use a large pool", and continuously reduce the implementation cost of AI applications.
A month ago, Tongdao Liepin launched its AI-driven digital interviewer product in some cities, with more AI features in the pipeline.
The product runs on the flexible computing power solution provided by Wuwen Xinqiong and was fine-tuned from an open-source large model on the Wuwen Xinqiong platform.
Compared to other solutions on the market, this achieves higher inference acceleration and significantly reduces the cost of launching new features. Xia Lixue said this result gives the Wuqiong team great confidence.
Therefore, in addition to opening full registration, Wuwen Xinqiong has also officially opened trial invitations for users with large computing power needs, offering more cost-effective computing power and deeper optimization services spanning algorithms and hardware.
The companies that encounter computing power problems generally fall into three categories:
Companies that want to apply large models in mature scenarios have found computing power but do not know how to use it, and are unable to create differentiated products to achieve business upgrades.
For companies that want to create AI-Native applications, the computing power costs are difficult to afford, the tool chain is not easy to use, and the product launch and production ratio is unreasonable.
As their business expands, companies that train their own models often cannot find or afford the required computing power, resulting in excessively high operating costs.
By the end of 2023, China's total computing power had reached 197 exaFLOPS (about 1.97×10²⁰ floating-point operations per second), ranking second in the world, with an average annual growth rate of nearly 30% over the past five years.
With such a growth rate, why does the industry still feel that computing power is particularly difficult?
The reason is that the rise of the AI industry coincided with an explosion of engineering talent, accelerating the vigorous development of China's large-model industry. The demand side is "crying out for food", yet a large amount of computing power on the market remains untapped and underutilized, and there is no sufficiently systematic "large-model native" business model to turn that supply into products and services that meet market demand.
Greatly improved computing power cost-effectiveness, thanks to multi-chip optimization
"There is a lot of unactivated effective computing power on the market, and the hardware gap itself is narrowing rapidly, but people keep running into 'ecosystem problems' when trying to use it." Xia Lixue explained that hardware always iterates more slowly than software and costs more, and software developers do not want "variables" beyond their own R&D work creeping into their stack, so they tend to reach directly for chips with mature ecosystems.
Wuwen Xinqiong hopes to help all teams building large models "control variables", that is, when using Wuwen Xinqiong's computing power services, users do not need and will not feel the brand differences in the underlying computing power.
How did Wuwen Xinqiong, which was established less than a year ago, manage to achieve performance optimization on multiple computing cards in such a short period of time?
At the end of 2022, after large models drew widespread public attention, Xia Lixue and his mentor Wang Yu judged that China's overall computing power still lagged significantly behind the international state of the art. Relying solely on chip improvements or the iteration of individual chips would be far from enough; what was needed was a large-model ecosystem in which different models could be automatically deployed onto different hardware, putting all kinds of computing power to effective use.
A year later, Wuwen Xinqiong announced the optimization results it had achieved on NVIDIA, AMD, and other chips: a 2-4x increase in inference speed for large-model tasks.
Subsequently, AMD China announced a strategic partnership with Wuwen Xinqiong; the two parties will work together to improve the performance of commercial AI applications.
Now, at this press conference, Wuwen Xinqiong demonstrated its performance optimization data on 10 chips, showing that it had achieved the industry's best optimization results on each card.
"We have established strong trust with all our model and chip partners," said Xia Lixue. "This is partly due to our computational optimization capabilities for large models, and partly due to our strong focus on protecting our partners' data security. Wuwen Xinqiong will continue to maintain neutrality and will not create conflicts of interest with our customers. This is the foundation of our business."
Developing a "large model native" acceleration technology stack and system
"Transformer has unified this wave of model structures and is demonstrating a continued trend of breakthroughs in applications," Wang Yu said in his opening remarks. "In the AI 1.0 era, as a company, we could only handle a small subset of AI tasks. Today, however, the landscape is different. Large-scale model structures have been unified, and the hardware barriers built on the ecosystem are thinning."
Thanks to the surging AI wave around the world and the unique opportunities in the Chinese market, Wuwen Xinqiong is facing a huge technological opportunity.
Transformer is naturally designed based on a parallel computing architecture. The larger the model, the better the intelligent effect it brings. The more people use it, the greater the amount of computing required.
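That multiplicative growth can be made concrete with a common rule of thumb: a dense Transformer spends roughly 2 FLOPs per parameter per token at inference time, so compute scales linearly in both model size and traffic. A minimal illustration with generic figures (not Wuwen Xinqiong's data):

```python
def inference_flops(n_params: float, n_tokens: float) -> float:
    """Rule-of-thumb forward-pass compute for a dense Transformer:
    about 2 FLOPs per parameter per token processed."""
    return 2.0 * n_params * n_tokens

# A 7B-parameter model serving 1 billion tokens per day needs roughly
# 1.4e19 FLOPs/day; double the model size or the traffic, and the
# required compute doubles with it.
daily_flops = inference_flops(7e9, 1e9)
```

This is why "more users, bigger models" translates directly into the computing power demand described above.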
"Wuwen Xinqiong is developing a 'large-model native' acceleration technology stack," said Yan Shengen, co-founder and CTO of Wuwen Xinqiong. The implementation of large models relies on algorithms, computing power, data, and systems. Computing power determines the speed of large models, while well-designed systems can unleash the potential of hardware.
Wuwen Xinqiong's team has built a large-scale, high-performance AI computing platform with tens of thousands of GPUs, capable of managing thousands of systems, and has successfully built a cloud management system based on its own cluster, achieving unified scheduling across multiple domains and clouds.
One More Thing
"On the device side, people are more inclined to quickly implement the capabilities of large models into human-computer interaction interfaces to enhance the practical experience." Dai Guohao, co-founder and chief scientist of Wuwen Xinqiong, believes that in the future, AGI-level intelligence will emerge wherever computing power is available. The source of intelligence on each device is the LPU, a dedicated processor for large models.
The large model processor LPU can improve the energy efficiency and speed of large models on various end-side hardware.
At the press conference, Dai Guohao gave a "one card runs a large model" demonstration. His team launched the world's first FPGA-based large-model processor in early January this year. Leveraging hardware-software co-design for efficient large-model compression, the processor cut the FPGA deployment cost of the LLaMA2-7B model from four cards to one, achieving better price-performance and energy efficiency than GPUs built on the same process node. In the future, Wuwen Xinqiong's dedicated large-model processor IP for edge devices will be modularly integrated into various edge chips.
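The four-cards-to-one reduction is consistent with simple memory arithmetic: compressing weights from 16 bits to about 4 bits shrinks a model's footprint roughly 4x. A back-of-the-envelope sketch (illustrative only; the article does not detail the actual compression scheme used):

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory needed for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8.0 / 1e9

fp16_gb = weight_memory_gb(7e9, 16)  # 14.0 GB -- spills across several cards
int4_gb = weight_memory_gb(7e9, 4)   # 3.5 GB -- fits on a single card
```

Weights are only part of the story (activations and KV cache also consume memory), but the 4x shrink explains why one card can suffice after compression.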
"From cloud to edge, we will carry software-hardware co-optimization through to the end, significantly reducing the cost of deploying large models in all kinds of scenarios, so that more easy-to-use AI capabilities can enter more people's lives, better and more affordably." Dai Guohao announced that the Wuqiong LPU will be available in 2025.
-over-
Registration for the selection is about to close!
AIGC companies and products worth watching in 2024
Registration for the China AIGC Industry Summit "Hello, New Apps!" is now open; the summit will also be livestreamed online.