Lately, Groq has been fielding the same complaint from many of its customers, says its CEO, Jonathan Ross: They want to pay the startup more money to get more access to its chips. It's a problem most startups would love to have.

While many AI companies have focused on training large language models, Groq seeks to make them run as fast as possible using chips it developed called LPUs, or language processing units. The gambit is that as AI models get better, inference — the part where the AI makes decisions or answers questions — will demand more computing power than training will, positioning Groq to reap the rewards.

But Groq isn't the only game in town. Rivals big and small are trying to carve out space in the market, including Nvidia, which dominates the chip business for training and similarly sees inference as the next big thing. Groq's special (and tightly patented) sauce is its chip design, says Ross. "There's a lot of counterintuitive stuff that we've done," he tells Business Insider.

Groq raised $640 million in August, earning it a $2.8 billion valuation, and Ross says the company has healthy profit margins on half of its available models. It also aims to ship 108,000 LPUs by the first quarter of next year, and 2 million chips by the end of 2025, most of which will be made available over the cloud. Hitting those targets will require a lot of work on supply chains and winning over partners. "If we do that, we do believe we will be providing more than half the world's inference at that point," says Ross.

The industry has come a long way since Ross worked at Google from 2011 to 2016, improving the technology behind its ads. While there, he came to see AI's computing demands as prohibitively expensive.

He recalls Google's AI chief, Jeff Dean, giving the leadership team a presentation with just two slides and two points: AI works, but Google can't afford it. Dean asked Ross's team to design a chip based on a specific type of integrated circuit they were already using, and the result was Google's first tensor processing unit, or TPU, a chip built specifically for AI.

Not long after, Ross's team received a cryptic message from an Alphabet group he barely knew about: They had an AI model, and they wanted to know whether the TPU was as good as people were saying.

The group was DeepMind, and just a few weeks later its AI model, ported onto the TPU Ross's team had designed, defeated the world champion Lee Sedol at the game of Go. Watching the AlphaGo program land a complex "shoulder hit" move on its opponent validated Ross's conviction that faster inference meant better, smarter AI.

Fast-forward a decade, and Groq is preparing to produce its second-generation chip, which it says will be two to three times better across speed, cost, and energy consumption. Ross describes it as "like skipping from fifth grade all the way to your Ph.D. program."
