Developing so-called reasoning AI models is getting easier, and cheaper.
On Friday, NovaSky, a team of researchers at UC Berkeley's Sky Computing Lab, released Sky-T1-32B-Preview, a reasoning model that is competitive with an earlier version of OpenAI's o1 on a number of key benchmarks. Sky-T1 appears to be the first truly open source reasoning model in the sense that it can be replicated from scratch; the team released the dataset they used to train it as well as the necessary training code.
“Notably, Sky-T1-32B-Preview was trained for less than $450,” the team wrote in a blog post, “demonstrating that it is feasible to replicate high-level reasoning skills inexpensively and efficiently.”
Unlike most AI models, reasoning models effectively fact-check themselves, which helps them avoid some of the pitfalls that typically trip up models. Reasoning models take a bit longer (often seconds to minutes longer) to arrive at solutions than a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science and mathematics.
The NovaSky team says it used another reasoning model, Alibaba's QwQ-32B-Preview, to generate the initial training data for Sky-T1, then “curated” the data mixture and used OpenAI's GPT-4o-mini to convert the data into a more workable format. Training the 32-billion-parameter Sky-T1 took about 19 hours on a rack of 8 Nvidia H100 GPUs. (Parameters roughly correspond to a model's problem-solving capabilities.)
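For a rough sense of scale, the reported figures can be combined into an implied price per GPU-hour. The short Python sketch below assumes that the sub-$450 figure covers only the roughly 19-hour training run on the 8 GPUs and excludes data generation and curation; the article does not confirm that breakdown. The function and constant names are illustrative.

```python
# Figures as reported in the article; the training time is approximate.
NUM_GPUS = 8             # Nvidia H100 GPUs
TRAINING_HOURS = 19      # "about 19 hours"
REPORTED_COST_USD = 450  # upper bound: "less than $450"


def gpu_hours(num_gpus: int, hours: float) -> float:
    """Total GPU-hours consumed by a run on num_gpus for the given hours."""
    return num_gpus * hours


def implied_hourly_rate(cost_usd: float, num_gpus: int, hours: float) -> float:
    """Cost per GPU-hour, ASSUMING the whole cost is training compute."""
    return cost_usd / gpu_hours(num_gpus, hours)


if __name__ == "__main__":
    total = gpu_hours(NUM_GPUS, TRAINING_HOURS)
    rate = implied_hourly_rate(REPORTED_COST_USD, NUM_GPUS, TRAINING_HOURS)
    print(f"{total:.0f} GPU-hours, at most ${rate:.2f} per GPU-hour")
```

Under that assumption, the run works out to about 152 GPU-hours at no more than roughly $2.96 per GPU-hour.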
According to the NovaSky team, Sky-T1 performs better than an early preview version of o1 on MATH500, a collection of “competition-level” math challenges. The model also beats the o1 preview on a set of difficult problems from LiveCodeBench, a coding evaluation.
However, Sky-T1 falls short of the o1 preview on GPQA-Diamond, which contains physics, biology and chemistry questions that a PhD student would be expected to know.
It is also important to note that OpenAI's GA release of o1 is a stronger model than the preview version of o1, and that OpenAI is expected to release an even more powerful reasoning model, o3, in the coming weeks.
Still, the NovaSky team says Sky-T1 marks only the start of its effort to develop open source models with advanced reasoning capabilities.
“In the long run, we will focus on developing more efficient models that maintain strong reasoning performance and exploring advanced techniques that further improve the efficiency and accuracy of the models at test time,” the team wrote in the post. “Stay updated as we make progress on these exciting initiatives.”