Since 2019, we’ve been working quietly and determinedly on Generative Quantum AI. Our early focus on building natively quantum systems for machine learning has benefitted from and been accelerated by access to the world’s most powerful quantum computers, machines that cannot be classically simulated.
Our work additionally benefits from being very close to our Helios generation quantum computer, built in Colorado, USA. Helios is 1 trillion times more powerful than our H2 System, which is already significantly more advanced than all other quantum computers available.
While tools like ChatGPT have already made a profound impact on society, a critical limitation to their broader industrial and enterprise use has become clear. Classical large language models (LLMs) are computational behemoths, prohibitively huge and expensive to train, and prone to errors that damage their credibility.
Training models like ChatGPT means fitting billions, even trillions, of parameters to vast datasets. This demands immense computational power, often spread across thousands of GPUs or specialized hardware accelerators. The environmental cost is staggering: simply training GPT-3, for instance, consumed nearly 1,300 megawatt-hours of electricity, equivalent to the annual energy use of 130 average U.S. homes.
This doesn’t account for the ongoing operational costs of running these models, which remain high with every query.
Despite these challenges, the push to develop ever-larger models shows no signs of slowing down.
Enter quantum computing. Quantum technology offers a more sustainable, efficient, and high-performance solution—one that will fundamentally reshape AI, dramatically lowering costs and increasing scalability, while overcoming the limitations of today's classical systems.
At Quantinuum we have been maniacally focused on “rebuilding” machine learning (ML) techniques for Natural Language Processing (NLP) using quantum computers.
Our research team has worked on translating key innovations in natural language processing — such as word embeddings, recurrent neural networks, and transformers — into the quantum realm. The ultimate goal is not merely to port existing classical techniques onto quantum computers but to reimagine these methods in ways that take full advantage of the unique features of quantum computers.
We have a deep bench working on this. Our Head of AI, Dr. Steve Clark, previously spent 14 years as a faculty member at Oxford and Cambridge, and over 4 years as a Senior Staff Research Scientist at DeepMind in London. He works closely with Dr. Konstantinos Meichanetzidis, who is our Head of Scientific Product Development and who has been working for years at the intersection of quantum many-body physics, quantum computing, theoretical computer science, and artificial intelligence.
A critical element of the team’s approach to this project is avoiding the temptation to simply “copy-paste”, i.e. taking the math from a classical version and directly implementing that on a quantum computer.
This is motivated by the fact that quantum systems are fundamentally different from classical systems: their ability to leverage quantum phenomena like entanglement and interference ultimately changes the rules of computation. By ensuring these new models are properly mapped onto the quantum architecture, we are best poised to benefit from quantum computing’s unique advantages.
These advantages are not so far in the future as we once imagined – partially driven by our accelerating pace of development in hardware and quantum error correction.
The ultimate problem of making a computer understand a human language isn’t unlike trying to learn a new language yourself – you must hear/read/speak lots of examples, memorize lots of rules and their exceptions, memorize words and their meanings, and so on. However, it’s more complicated than that when the “brain” is a computer. Computers naturally speak their native languages very well, where everything from machine code to Python has a meaningful structure and set of rules.
In contrast, “natural” (human) language is very different from the strict compliance of computer languages: things like idioms confound any sense of structure, humor and poetry play with semantics in creative ways, and the language itself is always evolving. Still, people have been considering this problem since the 1950s (Turing’s original “test” of intelligence involves the automated interpretation and generation of natural language).
Up until the 1980s, most natural language processing systems were based on complex sets of hand-written rules. Starting in the late 1980s, however, there was a revolution in natural language processing with the introduction of machine learning algorithms for language processing.
Initial ML approaches were largely “statistical”: by analyzing large amounts of text data, one can identify patterns and probabilities. There were notable successes in translation (like translating French into English), and the birth of the web led to more innovations in learning from and handling big data.
What many consider “modern” NLP was born in the late 2000s, when expanded compute power and larger datasets enabled practical use of neural networks. Being mathematical models, neural networks are “built” out of the tools of mathematics, specifically linear algebra and calculus.
Building a neural network, then, means finding ways to manipulate language using the tools of linear algebra and calculus. This means representing words and sentences as vectors and matrices, developing tools to manipulate them, and so on. This is precisely the path that researchers in classical NLP have been following for the past 15 years, and the path that our team is now speedrunning in the quantum case.
The first major breakthrough in neural NLP came roughly a decade ago, when vector representations of words were developed, using the frameworks known as Word2Vec and GloVe (Global Vectors for Word Representation). In a recent paper, our team, including Carys Harvey and Douglas Brown, demonstrated how to do this in quantum NLP models – with a crucial twist. Instead of embedding words as real-valued vectors (as in the classical case), the team built it to work with complex-valued vectors.
In quantum mechanics, the state of a physical system is represented by a vector residing in a complex vector space, called a Hilbert space. By embedding words as complex vectors, we are able to map language into parameterized quantum circuits, and ultimately the qubits in our processor. This is a major advance that was largely underappreciated by the AI community but which is now rapidly gaining interest.
Using complex-valued word embeddings for QNLP means that from the bottom-up we are working with something fundamentally different. This different “geometry” may provide advantage in any number of areas: natural language has a rich probabilistic and hierarchical structure that may very well benefit from the richer representation of complex numbers.
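To make the idea of a complex-valued embedding concrete, here is a minimal sketch in plain Python. It is an illustration only, not the method from the paper: the word vectors are made-up numbers, and the helper names `to_state` and `fidelity` are hypothetical. The sketch normalizes a complex vector so that it could serve as the amplitudes of a two-qubit state, then compares words by the squared overlap of their states.

```python
import math

def to_state(vec):
    """Normalise a complex vector so it is a valid quantum state (unit norm)."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in vec))
    if norm == 0:
        raise ValueError("cannot normalise the zero vector")
    return [a / norm for a in vec]

def fidelity(u, v):
    """Squared magnitude of the inner product <u|v> of two normalised states."""
    inner = sum(complex(a).conjugate() * b for a, b in zip(u, v))
    return abs(inner) ** 2

# Hypothetical 4-dimensional (two-qubit) embeddings, invented for illustration.
cat = to_state([1 + 1j, 0.5j, 0.2, 0])
dog = to_state([1 + 0.8j, 0.4j, 0.3, 0.1])
# Same magnitudes as `cat` in every component, but one phase is flipped.
cat_rephased = to_state([1 + 1j, -0.5j, 0.2, 0])

print("cat vs dog:         ", fidelity(cat, dog))
print("cat vs cat_rephased:", fidelity(cat, cat_rephased))
```

Here `cat` and `cat_rephased` have identical magnitudes in every component and differ only in one phase, yet their overlap is below 1. That sensitivity to phase is the kind of extra structure a complex-valued representation carries compared with a real-valued one.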
Another breakthrough comes from the development of quantum recurrent neural networks (RNNs). RNNs are commonly used in classical NLP to handle tasks such as text classification and language modeling.
Our team, including Dr. Wenduan Xu, Douglas Brown, and Dr. Gabriel Matos, implemented a quantum version of the RNN using parameterized quantum circuits (PQCs). PQCs allow for hybrid quantum-classical computation, where quantum circuits process information and classical computers optimize the parameters controlling the quantum system.
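The hybrid loop described above can be sketched in a few lines. This toy example is an illustration under stated assumptions, not the team's quantum RNN: it simulates a single-qubit circuit RY(theta)|0> classically, and the function names and hyperparameters are hypothetical. The "quantum" part returns the expectation value of Pauli Z, and the classical part updates theta by gradient descent using the parameter-shift rule, which obtains the gradient from two extra circuit evaluations.

```python
import math

def run_circuit(theta):
    """Simulate RY(theta)|0> and return the expectation value of Pauli Z."""
    amp0 = math.cos(theta / 2)   # amplitude of |0>
    amp1 = math.sin(theta / 2)   # amplitude of |1>
    return amp0 ** 2 - amp1 ** 2

def parameter_shift_gradient(theta):
    """Gradient of <Z> with respect to theta from two shifted evaluations."""
    shift = math.pi / 2
    return 0.5 * (run_circuit(theta + shift) - run_circuit(theta - shift))

def train(theta=0.3, learning_rate=0.4, steps=100):
    """Classical optimiser: gradient descent that minimises <Z>."""
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(theta)
    return theta

theta_opt = train()
print("optimised theta:", theta_opt)
print("final <Z>:      ", run_circuit(theta_opt))
```

On real hardware, `run_circuit` would be replaced by repeated executions of the circuit and an average over measurement outcomes, while the optimizer loop stays on the classical side.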
In a recent experiment, the team used their quantum RNN to perform a standard NLP task: classifying movie reviews from Rotten Tomatoes as positive or negative. Remarkably, the quantum RNN performed as well as classical RNNs, GRUs, and LSTMs, using only four qubits. This result is notable for two reasons: it shows that quantum models can achieve competitive performance using a much smaller vector space, and it demonstrates the potential for significant energy savings in the future of AI.
In a similar experiment, our team partnered with Amgen to use PQCs for peptide classification, which is a standard task in computational biology. Working on the Quantinuum System Model H1, the joint team performed sequence classification (used in the design of therapeutic proteins), and they found competitive performance with classical baselines of a similar scale. This work was our first proof-of-concept application of near-term quantum computing to a task critical to the design of therapeutic proteins, and helped us to elucidate the route toward larger-scale applications in this and related fields, in line with our hardware development roadmap.
Transformers, the architecture behind models like GPT-3, have revolutionized NLP by enabling massive parallelism and state-of-the-art performance in tasks such as language modeling and translation. However, transformers are designed to take advantage of the parallelism provided by GPUs, something quantum computers do not yet do in the same way.
In response, our team, including Nikhil Khatri and Dr. Gabriel Matos, introduced “Quixer”, a quantum transformer model tailored specifically for quantum architectures.
By using quantum algorithmic primitives, Quixer is optimized for quantum hardware, making it highly qubit efficient. In a recent study, the team applied Quixer to a realistic language modeling task and achieved results competitive with classical transformer models trained on the same data.
This is an incredible milestone achievement in and of itself.
This paper also marks the first quantum machine learning model applied to language on a realistic rather than toy dataset.
This is a truly exciting advance for anyone interested in the union of quantum computing and artificial intelligence. It is also in danger of being lost in the growing ‘noise’ from the quantum computing sector, where organizations trying to raise capital highlight advances that are often trivial or duplicative.
Carys Harvey and Richie Yeung from Quantinuum in the UK worked with a broader team that explored the use of quantum tensor networks for NLP. Tensor networks are mathematical structures that efficiently represent high-dimensional data, and they have found applications in everything from quantum physics to image recognition. In the context of NLP, tensor networks can be used to perform tasks like sequence classification, where the goal is to classify sequences of words or symbols based on their meaning.
The team performed experiments on our System Model H1, finding comparable performance to classical baselines. This marked the first time a scalable NLP model was run on quantum hardware – a remarkable advance.
The tree-like structure of quantum tensor models lends itself incredibly well to specific features inherent to our architecture such as mid-circuit measurement and qubit re-use, allowing us to squeeze big problems onto few qubits.
Since quantum theory is inherently described by tensor networks, this is another example of how fundamentally different quantum machine learning approaches can look – again, there is a sort of “intuitive” mapping of the tensor networks used to describe the NLP problem onto the tensor networks used to describe the operation of our quantum processors.
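As a rough illustration of sequence classification with a tensor network, the sketch below contracts a chain of small matrices, one per symbol, from left to right. Everything in it is an assumption made for illustration: it uses a chain-shaped (matrix product) network rather than the tree-shaped models mentioned above, the task (is the number of "b" symbols even or odd?) is a toy, and the tensors are hand-picked rather than learned.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# One small matrix per symbol (bond dimension 2). These values are hand-picked
# for illustration; in a real model they would be learned from data.
SYMBOL_TENSORS = {
    "a": [[1, 0], [0, 1]],   # identity: "a" leaves the running state unchanged
    "b": [[0, 1], [1, 0]],   # swap: "b" flips the running state
}
LEFT_BOUNDARY = [1, 0]        # state before any symbol has been read
READOUT = [[1, 0], [0, 1]]    # maps the final state to one score per class

def classify(sequence):
    """Contract the chain left to right and return (label, scores)."""
    state = list(LEFT_BOUNDARY)
    for symbol in sequence:
        state = mat_vec(SYMBOL_TENSORS[symbol], state)
    scores = mat_vec(READOUT, state)
    label = max(range(len(scores)), key=lambda k: scores[k])
    return label, scores

# Label 0 means an even number of "b" symbols, label 1 means an odd number.
print(classify("abba"))
print(classify("ab"))
```

The point of the sketch is the shape of the computation: a long sequence is processed through a fixed, small "bond" between neighbouring tensors, which is why such models can fit large inputs onto few qubits.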
While it is still very early days, we have good indications that running AI on quantum hardware will be more energy efficient.
We recently published a result in “random circuit sampling”, a task used to compare quantum to classical computers. We beat the classical supercomputer in time to solution as well as energy use: our quantum computer used 30,000x less energy to complete the task than Frontier, the classical supercomputer we compared against.
We may see, as our quantum AI models grow in power and size, that there is a similar scaling in energy use: it’s generally more efficient to use ~100 qubits than it is to use ~10^18 classical bits.
Another major insight so far is that quantum models tend to require significantly fewer parameters to train than their classical counterparts. In classical machine learning, particularly in large neural networks, the number of parameters can grow into the billions, leading to massive computational demands.
Quantum models, by contrast, leverage the unique properties of quantum mechanics to achieve comparable performance with a much smaller number of parameters. This could drastically reduce the energy and computational resources required to run these models.
As quantum computing hardware continues to improve, quantum AI models may increasingly complement or even replace classical systems. By leveraging quantum superposition, entanglement, and interference, these models offer the potential for significant reductions in both computational cost and energy consumption. With fewer parameters required, quantum models could make AI more sustainable, tackling one of the biggest challenges facing the industry today.
The work being done by Quantinuum reflects the start of the next chapter in AI, and one that is transformative. As quantum computing matures, its integration with AI has the potential to unlock entirely new approaches that are not only more efficient and performant but can also handle the full complexities of natural language. The fact that Quantinuum’s quantum computers are the most advanced in the world, and cannot be simulated classically, gives us a unique glimpse into the future.
The future of AI now looks very much to be quantum and Quantinuum’s Gen QAI system will usher in the era in which our work will have meaningful societal impact.
Quantinuum, the world’s largest integrated quantum company, pioneers powerful quantum computers and advanced software solutions. Quantinuum’s technology drives breakthroughs in materials discovery, cybersecurity, and next-gen quantum AI. With over 500 employees, including 370+ scientists and engineers, Quantinuum leads the quantum computing revolution across continents.
From September 16th – 18th, Quantum World Congress (QWC) brought together visionaries, policymakers, researchers, investors, and students from across the globe to discuss the future of quantum computing in Tysons, Virginia.
Quantinuum is forging the path to universal, fully fault-tolerant quantum computing with our integrated full-stack. With our quantum experts on site, we showcased the latest on Quantinuum Systems, the world’s highest-performing, commercially available quantum computers, our new software stack featuring the key additions of Guppy and Selene, our path to error correction, and more.
Dr. Patty Lee Named the Industry Pioneer in Quantum
The Quantum Leadership Awards celebrate visionaries transforming quantum science into global impact. This year at QWC, Dr. Patty Lee, our Chief Scientist for Hardware Technology Development, was named the Industry Pioneer in Quantum! This honor celebrates her more than two decades of leadership in quantum computing and her pivotal role advancing the world’s leading trapped-ion systems. Watch the Award Ceremony here.
Keynote with Quantinuum's CEO, Dr. Rajeeb Hazra
At QWC 2024, Quantinuum’s President & CEO, Dr. Rajeeb “Raj” Hazra, took the stage to showcase our commitment to advancing quantum technologies through the unveiling of our roadmap to universal, fully fault-tolerant quantum computing by the end of this decade. This year at QWC 2025, Raj shared the progress we’ve made over the last year in advancing quantum computing on both commercial and technical fronts and exciting insights on what’s to come from Quantinuum. Access the full session here.
Panel Session: Policy Priorities for Responsible Quantum and AI
As part of the Track Sessions on Government & Security, Quantinuum’s Director of Government Relations, Ryan McKenney, discussed “Policy Priorities for Responsible Quantum and AI” with Jim Cook from Actions to Impact Strategies and Paul Stimers from Quantum Industry Coalition.
Fireside Chat: Establishing a Pro-Innovation Regulatory Framework
During the Track Session on Industry Advancement, Quantinuum’s Chief Legal Officer, Kaniah Konkoly-Thege, and Director of Government Relations, Ryan McKenney, discussed the importance of “Establishing a Pro-Innovation Regulatory Framework”.
In the world of physics, ideas can lie dormant for decades before revealing their true power. What begins as a quiet paper in an academic journal can eventually reshape our understanding of the universe itself.
In 1993, nestled deep in the halls of Yale University, physicist Subir Sachdev and his graduate student Jinwu Ye stumbled upon such an idea. Their work, originally aimed at unraveling the mysteries of “spin fluids”, would go on to ignite one of the most surprising and profound connections in modern physics—a bridge between the strange behavior of quantum materials and the warped spacetime of black holes.
Two decades after the paper was published, it would be pulled into the orbit of a radically different domain: quantum gravity. Thanks to work by renowned physicist Alexei Kitaev in 2015, the model found new life as a testing ground for the mind-bending theory of holography: the idea that the universe we live in might be a projection from a lower-dimensional reality.
Holography is an exotic approach to understanding reality where scientists use holograms to describe higher dimensional systems in one less dimension. So, if our world is 3+1 dimensional (3 spatial directions plus time), there exists a 2+1, or 3-dimensional description of it. In the words of Leonard Susskind, a pioneer in quantum holography, "the three-dimensional world of ordinary experience—the universe filled with galaxies, stars, planets, houses, boulders, and people—is a hologram, an image of reality coded on a distant two-dimensional surface."
The “SYK” model, as it is known today, is now considered a quintessential framework for studying strongly correlated quantum phenomena, which occur in everything from superconductors to strange metals, and even in black holes. In fact, the SYK model has also been used to study one of physics’ true final frontiers, quantum gravity, with the authors of the paper calling it “a paradigmatic model for quantum gravity in the lab.”
The SYK model involves Majorana fermions, a type of particle that is its own antiparticle. A key feature of the model is that these fermions are all-to-all connected, leading to strong correlations. This connectivity makes the model particularly challenging to simulate on classical computers, where such correlations are difficult to capture. Our quantum computers, however, natively support all-to-all connectivity, making them a natural fit for studying the SYK model.
Now, 10 years after Kitaev’s watershed lectures, we’ve made new progress in studying the SYK model. In a new paper, we’ve completed the largest ever SYK study on a quantum computer. By exploiting our system’s native high fidelity and all-to-all connectivity, as well as our scientific team’s deep expertise across many disciplines, we were able to study the SYK model at a scale three times larger than the previous best experimental attempt.
While this work does not exceed classical techniques, it is very close to the classical state of the art. The biggest ever classical study was done on 64 fermions, while our recent result, run on our smallest processor (System Model H1), included 24 fermions. Modelling 24 fermions costs us only 12 qubits (plus one ancilla), making it clear that we can quickly scale these studies: our System Model H2 supports 56 qubits (or ~100 fermions), and Helios, which is coming online this year, will have over 90 qubits (or ~180 fermions).
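The qubit counts above follow from the fact that two Majorana fermions can be encoded per qubit. The sketch below assumes the Jordan-Wigner encoding (the text does not say which encoding the paper uses) and checks, with a small hand-rolled Pauli-string multiplier, that 24 Majorana operators on 12 qubits each square to the identity and anticommute with one another. All function names are hypothetical.

```python
import itertools

def single_qubit_product(a, b):
    """Return (phase, pauli) such that a * b = phase * pauli on one qubit."""
    if a == "I":
        return 1, b
    if b == "I":
        return 1, a
    if a == b:
        return 1, "I"
    cyclic = {("X", "Y"): "Z", ("Y", "Z"): "X", ("Z", "X"): "Y"}
    if (a, b) in cyclic:
        return 1j, cyclic[(a, b)]
    return -1j, cyclic[(b, a)]

def multiply(p, q):
    """Multiply two Pauli strings of equal length; return (phase, string)."""
    phase, letters = 1, []
    for a, b in zip(p, q):
        factor, letter = single_qubit_product(a, b)
        phase *= factor
        letters.append(letter)
    return phase, "".join(letters)

def majorana(index, n_qubits):
    """Jordan-Wigner image of Majorana operator `index` as a Pauli string."""
    qubit, kind = divmod(index, 2)
    tail = "I" * (n_qubits - qubit - 1)
    return "Z" * qubit + ("X" if kind == 0 else "Y") + tail

def is_valid_encoding(n_majoranas):
    """Check the Majorana algebra for an even number of operators."""
    n_qubits = n_majoranas // 2   # two Majorana operators per qubit
    ops = [majorana(i, n_qubits) for i in range(n_majoranas)]
    squares_ok = all(multiply(p, p) == (1, "I" * n_qubits) for p in ops)
    anticommute_ok = all(
        multiply(p, q)[0] + multiply(q, p)[0] == 0
        for p, q in itertools.combinations(ops, 2)
    )
    return squares_ok and anticommute_ok

print(majorana(0, 3), majorana(1, 3), majorana(2, 3))
print(is_valid_encoding(24))
```

The strings of Z operators are what make distant Majorana operators anticommute, which is one reason all-to-all interactions translate into operations spanning many qubits.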
However, working with the SYK model takes more than just qubits. The SYK model has a complex Hamiltonian that is difficult to work with when encoded on a computer—quantum or classical. Studying the real-time dynamics of the SYK model means first representing the initial state on the qubits, then evolving it properly in time according to an intricate set of rules that determine the outcome. This means deep circuits (many circuit operations), which demand very high fidelity, or else an error will occur before the computation finishes.
Our cross-disciplinary team worked to ensure that we could pull off such a large simulation on a relatively small quantum processor, laying the groundwork for quantum advantage in this field.
First, the team adopted a randomized quantum algorithm called TETRIS to run the simulation. By using random sampling, among other methods, the TETRIS algorithm allows one to compute the time evolution of a system without the pernicious discretization errors or sizable overheads that plague other approaches. TETRIS is particularly suited to simulating the SYK model because, with a high level of disorder in the material, simulating the SYK Hamiltonian means averaging over many random Hamiltonians. With TETRIS, one generates random circuits to compute the evolution (even with a deterministic Hamiltonian). Therefore, when applying TETRIS to SYK, for every shot one can simply generate a random instance of the Hamiltonian and a random TETRIS circuit at the same time. This simple approach requires fewer gates per shot, meaning users can run more shots, naturally mitigating noise.
In addition, the team “sparsified” the SYK model, which means “pruning” the fermion interactions to reduce the complexity while still maintaining its crucial features. By combining sparsification and the TETRIS algorithm, the team was able to significantly reduce the circuit complexity, allowing it to be run on our machine with high fidelity.
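A rough sketch of what sparsification means in practice is given below. It assumes the standard four-body form of the SYK model with Gaussian random couplings, and keeps each interaction independently with a fixed probability. The function name, the keep probability, and the coupling scale are hypothetical; the exact pruning rule and normalization used in the paper are not stated here. The sketch does not implement TETRIS itself.

```python
import itertools
import math
import random

def sample_sparse_syk(n_majoranas, keep_probability, coupling_scale=1.0, seed=None):
    """Draw one random instance: a dict mapping (i, j, k, l) to a coupling.

    Each four-body term is kept with probability `keep_probability`;
    kept terms receive a Gaussian coupling with standard deviation
    `coupling_scale` (a free parameter in this sketch).
    """
    rng = random.Random(seed)
    couplings = {}
    for quad in itertools.combinations(range(n_majoranas), 4):
        if rng.random() < keep_probability:
            couplings[quad] = rng.gauss(0.0, coupling_scale)
    return couplings

full = sample_sparse_syk(24, keep_probability=1.0, seed=0)
sparse = sample_sparse_syk(24, keep_probability=0.05, seed=0)
print("terms in the full model:  ", len(full))
print("terms after sparsifying:  ", len(sparse))
print("all possible 4-body terms:", math.comb(24, 4))
```

Calling the function with a fresh seed for every shot would give a new random instance each time, in the spirit of averaging over many random Hamiltonians, while the reduced number of terms is what shortens the circuits.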
They didn’t stop there. The team also proposed two new noise mitigation techniques, ensuring that they could run circuits deep enough without devolving entirely into noise. The two techniques both worked quite well, and the team was able to show that their algorithm, combined with the noise mitigation, performed significantly better and delivered more accurate results. The perfect agreement between the circuit results and the true theoretical results is a remarkable feat coming from a co-design effort between algorithms and hardware.
As we scale to larger systems, we come closer than ever to realizing quantum gravity in the lab, and thus, answering some of science’s biggest questions.
At Quantinuum, we pay attention to every detail. From quantum gates to teleportation, we work hard every day to ensure our quantum computers operate as effectively as possible. This means not only building the most advanced hardware and software, but also constantly innovating new ways to make the most of our systems.
A key step in any computation is preparing the initial state of the qubits. Like lining up dominoes, you first need a special setup to get meaningful results. This process, known as state preparation or “state prep,” is an open field of research that can mean the difference between realizing the next breakthrough or falling short. Done ineffectively, state prep can carry steep computational costs, scaling exponentially with the qubit number.
Recently, our algorithm teams have been tackling this challenge from all angles. We’ve published three new papers on state prep, covering state prep for chemistry, materials, and fault tolerance.
In the first paper, our team tackled the issue of preparing states for quantum chemistry. Representing chemical systems on gate-based quantum computers is a tricky task, partly because you often want to prepare multiconfigurational states, which are very complex. Preparing states like this can cost a lot of resources, so our team worked to ensure we can do it without breaking the (quantum) bank.
To do this, our team investigated two different state prep methods. The first method uses Givens rotations, implemented to save computational costs. The second method exploits the sparsity of the molecular wavefunction to maximize efficiency.
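For readers unfamiliar with Givens rotations, the sketch below shows the underlying linear algebra: a Givens rotation mixes exactly two amplitudes and leaves the rest untouched, so a chain of them can move weight from a single reference configuration into a chosen superposition. This is a classical illustration with real amplitudes and hypothetical function names. It is not the circuit construction from the paper, nor the InQuanto API.

```python
import math

def apply_givens(vector, i, j, theta):
    """Rotate amplitudes i and j by angle theta; other entries are unchanged."""
    c, s = math.cos(theta), math.sin(theta)
    out = list(vector)
    out[i] = c * vector[i] - s * vector[j]
    out[j] = s * vector[i] + c * vector[j]
    return out

def rotation_angles(target):
    """Angles for rotations on (0,1), (1,2), ... that build a normalised target."""
    n = len(target)
    angles = []
    for k in range(n - 1):
        if k == n - 2:
            tail = target[n - 1]   # keep the sign of the last amplitude
        else:
            # Weight that still has to be passed further down the chain.
            tail = math.sqrt(sum(a * a for a in target[k + 1:]))
        angles.append(math.atan2(tail, target[k]))
    return angles

def prepare(target):
    """Start from the reference state (1, 0, ..., 0) and rotate into `target`."""
    state = [1.0] + [0.0] * (len(target) - 1)
    for k, theta in enumerate(rotation_angles(target)):
        state = apply_givens(state, k, k + 1, theta)
    return state

# Hypothetical three-configuration target with unit norm.
target = [0.6, -0.64, 0.48]
print(prepare(target))
```

A target with n configurations needs only n - 1 rotations in this scheme, which gives a feel for why the number of configurations with non-negligible weight drives the cost of state preparation.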
Once the team perfected the two methods, they implemented them in InQuanto to explore the benefits across a range of applications, including calculating the ground and excited states of a strongly correlated molecule (twisted C₂H₄). The results showed that the “sparse state preparation” scheme performed especially well, requiring fewer gates and shorter runtimes than alternative methods.
In the second paper, our team focused on state prep for materials simulation. Generally, it’s much easier for computers to simulate materials that are at zero temperature, which is, obviously, unrealistic. Much more relevant to most scientists is what happens when a material is not at zero temperature. In this case, there are two situations to consider: the material sits steadily at a given temperature, which scientists call thermal equilibrium, or the material is going through some change, known as being out of equilibrium. Both are much harder for classical computers to work with.
In this paper, our team looked to solve an outstanding problem: there is no standard protocol for preparing thermal states. In this work, our team only targeted equilibrium states but, interestingly, they used an out of equilibrium protocol to do the work. By slowly and gently evolving from a simple state that we know how to prepare, they were able to prepare the desired thermal states in a way that was remarkably insensitive to noise.
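For reference, "thermal equilibrium" has a precise meaning: an energy level E is occupied with probability proportional to exp(-E / T). The sketch below computes these Boltzmann populations classically for a hypothetical list of energy levels, in units where Boltzmann's constant is 1. It only illustrates the target distribution; it does not implement the evolution-based preparation protocol described above.

```python
import math

def thermal_populations(energies, temperature):
    """Boltzmann populations exp(-E/T) / Z for the given energy levels."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    e_min = min(energies)
    # Subtracting the lowest energy avoids overflow and cancels in the ratio.
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    partition_function = sum(weights)
    return [w / partition_function for w in weights]

# Hypothetical three-level system, energies in arbitrary units.
levels = [0.0, 1.0, 2.5]
for temperature in (0.05, 1.0, 10.0):
    print(temperature, thermal_populations(levels, temperature))
```

At very low temperature nearly all the weight sits in the lowest level, which is the zero-temperature limit that is easier to simulate. As the temperature rises, the excited levels become populated and the state becomes a genuine mixture.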
Ultimately, this work could prove crucial for studying materials like superconductors. After all, no practical superconductor will ever be used at zero temperature. In fact, we want to use them at room temperature – and approaches like this are what will allow us to perform the necessary studies to one day get us there.
Finally, as we advance toward the fault-tolerant era, we encounter a new set of challenges: making computations fault-tolerant at every step can be an expensive venture, eating up qubits and gates. In the third paper, our team made fault-tolerant state preparation—the critical first step in any fault-tolerant algorithm—roughly twice as efficient. With our new “flag at origin” technique, gate counts are significantly reduced, bringing fault-tolerant computation closer to an everyday reality.
The method our researchers developed is highly modular: in the past, to perform optimized state prep like this, developers needed to solve one big expensive optimization problem. In this new work, we’ve figured out how to break the problem up into smaller pieces, in the sense that one now needs to solve a set of much smaller problems. This means that now, for the first time, developers can prepare fault-tolerant states for much larger error correction codes, a crucial step forward in the early-fault-tolerant era.
On top of this, our new method is highly general: it applies to almost any QEC code one can imagine. Normally, fault-tolerant state prep techniques must be anchored to a single code (or a family of codes), making it so that when you want to use a different code, you need a new state prep method. Now, thanks to our team’s work, developers have a single, general-purpose, fault-tolerant state prep method that can be widely applied and ported between different error correction codes. Like the modularity, this is a huge advance for the whole ecosystem—and is quite timely given our recent advances into true fault-tolerance.
This generality isn’t just applicable to different codes; it’s also applicable to the states that you are preparing. While other methods are optimized for preparing only the |0> state, this method is useful for a wide variety of states that are needed to set up a fault-tolerant computation. This “state diversity” is especially valuable when working with the best codes, those that encode many logical qubits in a given number of physical qubits. This new approach to fault-tolerant state prep will likely be the method used for fault-tolerant computations across the industry, and if not, it will inform new approaches moving forward.
From the initial state preparation to the final readout, we are ensuring that not only is our hardware the best, but that every single operation is as close to perfect as we can get it.