By Bob Coecke and Dimitri Kartsaklis
Introduction
Today we announce the next generation of λambeq, Quantinuum’s quantum natural language processing (QNLP) package.
Incorporating recent developments in both quantum NLP and quantum hardware, λambeq Gen II allows users not only to model the semantics of natural language (in terms of vectors and tensors), but also to convert linguistic structures and meaning directly into quantum circuits for real quantum hardware.
Five years ago, our team reported the first realization of Quantum Natural Language Processing (QNLP). In that work, we showed that there is a direct correspondence between the meanings of words and quantum states, and between grammatical structure and quantum entanglement. As the article put it: “Language is effectively quantum native”.
We ran an NLP task on quantum hardware and made the data and code available in a GitHub repository, attracting the interest of a then-nascent quantum NLP community. Eighteen months later we released λambeq, supported by a research paper on the arXiv, and that community has since grown around its successive releases.
λambeq: an open-source Python library that turns sentences into quantum circuits and feeds them to quantum computers using variational quantum circuit (VQC) methodologies. Initial release: October 2021 (arXiv:2110.04236).
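For readers who want to see what this looks like in practice, here is a minimal sketch of the sentence-to-circuit step, based on the documented API of earlier lambeq releases (class names such as BobcatParser and IQPAnsatz come from those releases and may differ in Gen II):

```python
# A minimal sketch (not the exact Gen II API): parse a sentence with lambeq,
# then map the resulting string diagram to a parameterised quantum circuit.
from lambeq import AtomicType, BobcatParser, IQPAnsatz

# Parse the sentence into a string diagram capturing its grammatical structure.
parser = BobcatParser()
diagram = parser.sentence2diagram("Alice prefers quantum computers")

# Choose an ansatz: here one qubit per noun wire and per sentence wire,
# with a single IQP layer of parameterised gates.
ansatz = IQPAnsatz({AtomicType.NOUN: 1, AtomicType.SENTENCE: 1}, n_layers=1)
circuit = ansatz(diagram)

circuit.draw()  # inspect the circuit before sending it to a quantum backend
```

In a typical VQC workflow, the parameters of such circuits are then trained on a downstream task, for instance with lambeq’s trainer utilities or an external optimiser.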
From that moment onwards, anyone could play around with QNLP on the then freely available quantum hardware. Our λambeq software has been downloaded over 50,000 times, and the user community is supported by an active Discord page, where practitioners can interact with each other and with our development team.
The QNLP Back-Story
To demonstrate that QNLP was possible even on the hardware available in 2021, we focused exclusively on small, noisy quantum computers. Our aim was exploratory: to look for evidence of a potential quantum advantage for natural language processing on quantum hardware. Our original scientific work, published in 2016, detailed a quadratic speedup over classical computers in certain circumstances, and we are strongly convinced that there is far more potential than that paper indicated.
That first realization of QNLP marked a shift away from brute-force machine learning, which has since taken the world by storm in the shape of large language models (LLMs) built on the “transformer” architecture.
Instead of the transformer approach, we modelled linguistic structure using a compositional theory of meaning. With deep roots in computational linguistics, our approach was inspired by research into compositional linguistic structures and their resemblance to quantum primitives such as quantum teleportation. As the work progressed, it became clear that this approach reduced training requirements by exploiting a natural relationship between linguistic structure and quantum structure, making QNLP practical on near-term hardware.
Embedding recent progress in λambeq Gen II
We haven’t sat still, and neither have the teams working on quantum hardware. Quantinuum’s stack now performs at a level we only dreamed of in 2020. While we look forward to continued progress on the hardware front, we are getting ahead of these developments by shifting the focus of our algorithms and software packages, so that we and λambeq’s users are ready to chase far more ambitious goals!
We have moved from DisCoCat, the compositional theory of meaning at the heart of our early experiments, to a new mathematical foundation called DisCoCirc. This enabled us to explore the relationship between text generation and text circuits, concluding that “text circuits are generative for text”.
Formally speaking, DisCoCirc embraces substantially more of the compositional structure present in language than DisCoCat does, and that pays off in many ways:
- Firstly, the new theoretical backbone enables one to compose the structures of individual sentences into the structure of a whole text, so we can now deal with large texts.
- Secondly, the compositional structure of language is represented in a compressed manner that, in fact, makes the formalism language-neutral, as reported in this blog post.
- Thirdly, the richer compositional linguistic structure, together with the requirement of learnability, now makes a quantum model canonical, and we have solid theoretical evidence of genuinely enhanced performance on quantum hardware, as shown in this arXiv paper.
- Fourthly, the problems associated with the trainability of quantum machine learning models vanish, thanks to compositional generalization, which was the subject of this paper.
- Lastly, but certainly not least, we reported on the achievement of compositional interpretability and explored the myriad ways in which it supports explainable AI (XAI), which we also discussed extensively in this blog post.
Today, our users can take advantage of these recent developments with the release of λambeq Gen II. Our open-source tools have always benefited from the attention and feedback of our users, so please give the new release a try; we look forward to hearing what you think.
Enjoy!