The deep neural network models that power today's most demanding machine-learning applications have grown so large and complex that they are pushing the limits of conventional electronic computing hardware.
Photonic hardware, which can perform machine-learning computations with light, offers a faster and more energy-efficient alternative. However, there are some types of neural network computations that a photonic device cannot perform, requiring the use of off-chip electronics or other techniques that hamper speed and efficiency.
Building on a decade of research, scientists at MIT and elsewhere have developed a new photonic chip that overcomes these roadblocks. They demonstrated a fully integrated photonic processor that can perform all the key computations of a deep neural network optically on the chip.
The optical device was able to complete the key computations for a machine-learning classification task in less than half a nanosecond while achieving more than 92 percent accuracy – performance comparable to traditional hardware.
The chip, composed of interconnected modules that form an optical neural network, is fabricated using commercial foundry processes, which could enable the technology to be scaled up and integrated into electronics.
In the long run, the photonic processor could lead to faster and more energy-efficient deep learning for computationally demanding applications such as lidar, scientific research in astronomy and particle physics, or high-speed telecommunications.
“In many cases, it’s not only how well the model performs, but also how fast you can get an answer. Now that we have an end-to-end system that can run a neural network in optics, at a nanosecond time scale, we can start thinking at a higher level about applications and algorithms,” says Saumil Bandyopadhyay '17, MEng '18, PhD '23, a visiting scientist in the Quantum Photonics and AI Group within the Research Laboratory of Electronics (RLE) and a postdoc at NTT Research, Inc., who is the lead author of a paper on the new chip.
Bandyopadhyay is joined on the paper by Alexander Sludds '18, MEng '19, PhD '23; Nicholas Harris PhD '17; Darius Bunandar PhD '19; Stefan Krastanov, a former RLE research scientist who is now an assistant professor at the University of Massachusetts at Amherst; Ryan Hamerly, a visiting scientist at RLE and senior scientist at NTT Research; Matthew Streshinsky, a former head of silicon photonics at Nokia who is now co-founder and CEO of Enosemi; Michael Hochberg, president of Periplous, LLC; and Dirk Englund, a professor in the Department of Electrical Engineering and Computer Science, principal investigator of the Quantum Photonics and Artificial Intelligence Group and of RLE, and senior author of the paper. The research appears today in Nature Photonics.
Machine learning with light
Deep neural networks are composed of many interconnected layers of nodes, or neurons, that operate on input data to produce an output. One key operation in a deep neural network involves the use of linear algebra to perform matrix multiplication, which transforms data as it is passed from layer to layer.
But in addition to these linear operations, deep neural networks perform nonlinear operations that help the model learn more intricate patterns. Nonlinear operations, such as activation functions, give deep neural networks the power to solve complex problems.
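As a concrete illustration, the sketch below implements one such layer in NumPy: a matrix multiplication followed by a nonlinear activation. The sizes, weights, and the ReLU activation are illustrative choices, not details from the paper.

```python
import numpy as np

# Minimal sketch of one deep-neural-network layer: a linear matrix
# multiplication followed by a nonlinear activation function.
rng = np.random.default_rng(0)

W = rng.normal(size=(4, 8))   # weight matrix (the linear operation)
b = np.zeros(4)               # bias vector
x = rng.normal(size=8)        # input arriving from the previous layer

def relu(z):
    """A common activation function: the nonlinear step."""
    return np.maximum(z, 0.0)

y = relu(W @ x + b)           # layer output, passed on to the next layer
print(y)
```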
In 2017, Englund's group, along with researchers in the laboratory of Marin Soljačić, the Cecil and Ida Green Professor of Physics, demonstrated an optical neural network on a single photonic chip that could perform matrix multiplication with light.
But at the time, the device couldn’t perform nonlinear operations on the chip. Optical data had to be converted into electrical signals and sent to a digital processor to perform the nonlinear operations.
“Nonlinearity in optics is quite challenging because photons don’t interact with each other very easily. That makes it very power-consuming to trigger optical nonlinearities, so it becomes challenging to build a system that can do it in a scalable way,” Bandyopadhyay explains.
They overcame that challenge by designing devices called nonlinear optical function units (NOFUs), which combine electronics and optics to implement nonlinear operations on the chip.
The researchers built an optical deep neural network on a photonic chip using three layers of devices that perform linear and nonlinear operations.
A fully integrated network
To begin, their system encodes the parameters of a deep neural network into light. Then, an array of programmable beamsplitters, which was demonstrated in the 2017 paper, performs matrix multiplication on those inputs.
The data then pass to programmable NOFUs, which implement nonlinear functions by siphoning off a small amount of light to photodiodes that convert optical signals to electric current. This process, which eliminates the need for an external amplifier, consumes very little energy.
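To make the dataflow concrete, here is a toy NumPy simulation of that forward pass under our own simplifying assumptions: the programmable beamsplitter mesh is stood in for by a random unitary matrix, and the NOFU is modeled as tapping a fraction of the optical power to a photodiode whose signal sets an intensity-dependent transmission on the light that continues. The response function and every parameter are illustrative, not the device's.

```python
import numpy as np

# Toy model of the on-chip forward pass. Assumptions (ours, not the
# paper's): each layer's beamsplitter mesh acts as a unitary matrix,
# and each NOFU taps a fraction TAP of the optical power to a
# photodiode whose signal sets an intensity-dependent transmission
# on the remaining light.
rng = np.random.default_rng(1)
N = 4  # number of optical modes (illustrative)

def random_unitary(n):
    """Random unitary as a stand-in for a programmed beamsplitter mesh."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(a)
    return q

TAP = 0.1  # fraction of power diverted to the photodiode (illustrative)

def nofu(amps):
    tapped_power = TAP * np.abs(amps) ** 2       # photodiode signal
    through = np.sqrt(1.0 - TAP) * amps          # light that stays on chip
    transmission = 1.0 / (1.0 + tapped_power)    # toy electro-optic response
    return transmission * through                # nonlinear output, still optical

x = rng.normal(size=N) + 1j * rng.normal(size=N)  # input encoded in light
for _ in range(3):                                # three layers, as on the chip
    x = nofu(random_unitary(N) @ x)               # linear mesh, then NOFU
print(np.abs(x) ** 2)                             # intensities read out at the end
```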
“We stay in the optical domain the whole time, until the end when we want to read out the answer. This enables us to achieve ultra-low latency,” Bandyopadhyay says.
Achieving such low latency enabled them to efficiently train a deep neural network on the chip, a process known as in-situ training that typically consumes a huge amount of energy on digital hardware.
“This is especially useful for systems that do in-domain processing of optical signals, such as navigation or telecommunications, but also for systems that you want to learn in real time,” he says.
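The paper's exact training procedure isn't described here, but one generic way to train hardware in place is to estimate gradients by perturbing the programmable parameters directly and measuring the change in loss. The sketch below shows that finite-difference approach against a dummy stand-in for the chip; every name and setting in it is hypothetical.

```python
import numpy as np

# Generic finite-difference in-situ training loop (a sketch of one
# common approach; not necessarily the paper's method). `forward`
# stands for a measurement of the hardware's output given its
# programmable settings theta.

def loss(theta, forward, x, target):
    return np.sum((forward(theta, x) - target) ** 2)

def in_situ_step(theta, forward, x, target, lr=0.05, eps=1e-3):
    grad = np.zeros_like(theta)
    base = loss(theta, forward, x, target)
    for i in range(theta.size):            # perturb one setting at a time
        probe = theta.copy()
        probe[i] += eps
        grad[i] = (loss(probe, forward, x, target) - base) / eps
    return theta - lr * grad               # update the on-chip settings

# Dummy stand-in for the chip: a single linear layer with tanh.
def forward(theta, x):
    W = theta.reshape(2, 2)
    return np.tanh(W @ x)

theta = np.zeros(4)
x = np.array([1.0, -1.0])
target = np.array([0.5, 0.2])
for _ in range(200):
    theta = in_situ_step(theta, forward, x, target)
print(forward(theta, x))  # should approach the target
```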
The photonic system achieved more than 96 percent accuracy during training tests and more than 92 percent accuracy during inference, comparable to traditional hardware. In addition, the chip performs the key computations in less than half a nanosecond.
“This work demonstrates that computing – at its essence, the mapping of inputs to outputs – can be mapped onto new architectures of linear and nonlinear physics that enable a fundamentally different scaling law for computation and energy,” says Englund.
The entire circuit was fabricated using the same infrastructure and foundry processes that produce CMOS computer chips. This could enable the chip to be manufactured at scale, using proven techniques that introduce very few errors into the fabrication process.
Bandyopadhyay says scaling up the device and integrating it with real-world electronics like cameras or telecommunications systems will be a major focus of future work. In addition, the researchers want to explore algorithms that can leverage the advantages of optics to train systems faster and with better energy efficiency.
This research was funded, in part, by the U.S. National Science Foundation, the U.S. Air Force Office of Scientific Research, and NTT Research.