U.S. Researchers Build Advanced Reasoning Model For Less Than $50

AI researchers from the University of Washington and Stanford trained an AI reasoning model, called s1, for less than $50 in cloud computing credits. The team released a paper titled "s1: Simple test-time scaling" on Monday with more details of their methodology.

In a Rush? Here are the Quick Facts!

  • AI researchers from the University of Washington and Stanford trained an AI reasoning model for less than $50 and shared their research this Monday.
  • They used distillation, test-time scaling, and supervised fine-tuning with a curated 1,000-question dataset.
  • The model s1 performs similarly to DeepSeek R1 and OpenAI o1.

According to TechCrunch, the new model performs similarly to advanced models like DeepSeek’s R1 and OpenAI’s o1, and is available on GitHub.

To develop the model, the researchers applied a process known as distillation, in which a larger AI model supplies training data to a smaller one, drawing reasoning capabilities from Google’s Gemini 2.0 Flash Thinking Experimental.
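
In rough terms, the distillation step can be sketched as follows: ask a stronger "teacher" model for answers that include its reasoning, then save the pairs as training data for a smaller "student" model. This is a minimal illustration under stated assumptions, not the authors' exact pipeline; the query_teacher callable, the sample question, and the output file name are placeholders standing in for calls to Gemini 2.0 Flash Thinking Experimental.

```python
# Minimal sketch of the distillation step (illustrative, not the s1 pipeline):
# query a "teacher" model for reasoning + answers, save pairs as JSONL.

import json
from typing import Callable


def build_distillation_set(
    questions: list[str],
    query_teacher: Callable[[str], str],
    out_path: str = "distilled_examples.jsonl",
) -> None:
    """Collect (question, teacher reasoning + answer) pairs as JSONL records."""
    with open(out_path, "w", encoding="utf-8") as f:
        for question in questions:
            # The teacher's response: a reasoning trace plus the final answer.
            answer = query_teacher(question)
            f.write(json.dumps({"question": question, "answer": answer}) + "\n")


if __name__ == "__main__":
    # Stubbed teacher so the sketch runs stand-alone; a real run would call the Gemini API here.
    build_distillation_set(
        questions=["A train covers 60 km in 45 minutes. What is its average speed?"],
        query_teacher=lambda q: "Reasoning: 60 km / 0.75 h = 80 km/h. Answer: 80 km/h.",
    )
```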

Distillation is gaining popularity in the AI industry: OpenAI claims that DeepSeek used the technique, without authorization, to develop its advanced reasoning model, and researchers from UC Berkeley’s Sky Computing Lab also recently trained a reasoning model for less than $450 the same way. The trend is sparking debate in Silicon Valley and anger among large AI companies.

To build their reasoning model, the s1 researchers also applied a “test-time scaling” approach, forcing the model to stop and reason more before providing an answer, and performed supervised fine-tuning from a pre-trained model.

“We develop budget forcing to control test-time compute by forcefully terminating the model’s thinking process or lengthening it by appending ‘Wait’ multiple times to the model’s generation when it tries to end,” states the paper. “This can lead the model to double-check its answer, often fixing incorrect reasoning.”
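
As a rough illustration of that idea, the sketch below controls "thinking" length at decode time: it cuts generation off once an upper bound is hit, and suppresses the model's attempt to stop by appending "Wait" until a minimum amount of reasoning has been produced. The generate_until callable, the </think> delimiter, and the character budget are assumptions for illustration, not the paper's exact implementation.

```python
# A rough sketch of budget forcing at decode time, under stated assumptions:
# `generate_until` stands in for an LLM decoding call that returns text up to
# (but not including) the end-of-thinking marker.

from typing import Callable

END_OF_THINKING = "</think>"  # assumed delimiter; the real token may differ


def budget_forced_think(
    generate_until: Callable[[str, str], str],
    prompt: str,
    min_extensions: int = 2,
    max_thinking_chars: int = 4000,
) -> str:
    """Control test-time compute by cutting thinking short or appending 'Wait'."""
    thinking = ""
    extensions = 0
    while True:
        # Decode until the model tries to close its thinking block.
        thinking += generate_until(prompt + thinking, END_OF_THINKING)

        # Upper bound reached: forcefully terminate the thinking process.
        if len(thinking) >= max_thinking_chars:
            break

        # Lower bound not reached: suppress the end marker and append "Wait",
        # nudging the model to double-check its reasoning.
        if extensions < min_extensions:
            thinking += "Wait"
            extensions += 1
        else:
            break
    return thinking


if __name__ == "__main__":
    # Stubbed generator so the sketch runs stand-alone; a real run would call an LLM.
    trace = budget_forced_think(
        generate_until=lambda context, stop: " ...one more reasoning step...",
        prompt="Question: what is 17 * 24?",
    )
    print(trace)
```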

The researchers used a dataset of 1,000 curated questions and answers to train their model in less than 30 minutes on Nvidia H100 GPUs, demonstrating that advanced results can be achieved with a small dataset by building on existing technologies and AI models.
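
A supervised fine-tuning run on such a small dataset can be sketched with the Hugging Face trl and datasets libraries; the dataset file, base model name, and hyperparameters below are placeholders for illustration, not the authors' exact configuration.

```python
# Minimal supervised fine-tuning sketch on a small curated dataset (placeholders throughout).

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed format: each JSON line holds a "text" field combining the question,
# the distilled reasoning trace, and the final answer (roughly 1,000 examples).
dataset = load_dataset("json", data_files="curated_1k_examples.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # small stand-in base model for illustration
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="s1-style-sft",
        num_train_epochs=5,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        learning_rate=1e-5,
    ),
)
trainer.train()
```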

“Recent advances in reasoning, such as OpenAI’s o1 and DeepSeek’s R1, lack transparency, limiting broader research progress,” wrote the researchers. “Our work aims to push the frontier of reasoning in a fully open manner, fostering innovation and collaboration to accelerate advancements that ultimately benefit society.”
