Plastic polymer cables competing with optical fiber - IEEE Spectrum



How fast does data flow? The answer: not fast enough.

The search for more efficient ways to transmit data to meet ever-growing computing needs is endless. Even today, most data is carried over traditional copper cables, which are power-hungry, forcing a trade-off between data throughput and energy consumption. Fiber-optic cables are an alternative, but they do not interface well with the silicon chips in our computing systems. Those limitations can in principle be overcome, but the result can be very expensive, especially for electronics-heavy applications such as data centers, spacecraft, and electric vehicles.

A group of scientists at the Massachusetts Institute of Technology recently demonstrated a plastic polymer cable as a complementary solution, one that combines the advantages of copper wire and optical fiber while shedding their drawbacks. The cable is thinner and lighter than copper, transmits data at speeds comparable to optical fiber, and is compatible with silicon chips. The team presented its results at the IEEE International Solid-State Circuits Conference in February, reporting data rates in excess of 100 gigabits per second.

Ruonan Han, one of the MIT researchers, said that it is becoming harder and harder to push more data efficiently through copper cables. For low-speed USB devices such as mice and keyboards, he added, the cables are sufficiently thin, cheap to manufacture, and long. "But with newer standards aimed at achieving higher data rates, we are seeing cables become thicker, more expensive, and often shorter [due to] technical challenges. We hope this research will [achieve] higher speeds to meet our needs."

The research, funded in part by Intel, Raytheon, the Naval Research Laboratory, and the Office of Naval Research, uses high-frequency sub-terahertz electromagnetic signals to demonstrate the efficiency of polymer cables. Han and his colleagues designed low-cost silicon chips to pair with polymer conductors laser-cut from sheets of polymer. In the experiment, the chips were connected by a polymer ribbon 30 cm long and 0.4 mm wide. This hair-thin ribbon carried data over three channels at a combined 105 gigabits per second, with far better energy efficiency than copper and a clean interface to the silicon chips.

"We envision that a bundle of such a thin ribbon will further greatly increase the speed," Han said. "[Data transfer speed] is actually not theoretically limited." In addition to increasing the number of channels, the MIT team is also investigating other ways to increase data transfer rates; Han said they are optimistic that they will soon cross the 1 TB per second barrier . Although they were restricted to 30 cm cables by the laboratory setting, he said that reaching 1 meter and above "should not be a big problem."

Although it is still early, Han believes the industry has an opportunity to adopt the MIT team's polymer cable as the new standard for future data links. Polymer ribbons have been used before as waveguides for propagating electromagnetic signals, but direct compatibility with silicon chips, without any special manufacturing, is what gives the MIT team's effort a boost. "[This] makes it easier to continue using existing silicon integrated-circuit technology," Han said.

He hopes these features will encourage industry adoption, though that will not be easy, "because it requires establishing industry standards, developing integrated circuit chips, manufacturing polymer cables, connectors, and so on." Despite these challenges, Han remains optimistic that their successful demonstration will attract the attention of materials scientists and engineers, who can improve the quality and manufacturability of the cables.

This computer rendering depicts the pattern on a photonic chip that the author and his colleagues designed to perform neural-network calculations using light.

Think of the many tasks to which computers are now applied that, in the not-so-distant past, required human intuition. Computers routinely recognize objects in images, transcribe speech, translate between languages, diagnose medical conditions, play complex games, and drive cars.

The technology behind these remarkable developments is called deep learning, a term that refers to mathematical models known as artificial neural networks. Deep learning is a subfield of machine learning, a branch of computer science based on fitting complex models to data.

Although machine learning has been around for a long time, deep learning has taken on a life of its own lately. The reason is mostly the growing availability of computing power, along with the large amounts of data that can now be easily collected and used to train neural networks.

Computing power within ordinary reach began advancing by leaps and bounds around the turn of the millennium, when graphics processing units (GPUs) started to be used for non-graphics calculations, a trend that has become increasingly common over the past decade. But the computational demands of deep learning have been growing even faster. This dynamic has prompted engineers to develop electronic hardware accelerators specifically targeted at deep learning, Google's tensor processing unit (TPU) being a prime example.

Here, I will describe a very different approach to this problem: using optical processors to carry out neural-network calculations with photons instead of electrons. To understand how optics can help, you need to know a little about how computers currently perform neural-network calculations. So bear with me as I outline what goes on under the hood.

Artificial neurons are almost always constructed using special software running on some kind of digital electronic computer. The software gives a given neuron multiple inputs and one output. The state of each neuron depends on the weighted sum of its inputs, to which a non-linear function, called the activation function, is applied. The result, the output of this neuron, then becomes an input to various other neurons.

For computational efficiency, these neurons are grouped into layers, with neurons connected only to neurons in adjacent layers. The benefit of arranging things that way, as opposed to allowing connections between any two neurons, is that it allows certain mathematical tricks of linear algebra to be used to speed the calculations.
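To make the neuron model above concrete, here is a minimal NumPy sketch of one layer's computation. The layer sizes, the random weights, and the choice of ReLU as the activation function are illustrative assumptions, not details taken from the article.

```python
import numpy as np

def relu(z):
    # One common choice of activation function (an assumption for this sketch).
    return np.maximum(0.0, z)

def dense_layer(x, W, b):
    # Each neuron forms a weighted sum of its inputs (one row of W plus a bias),
    # and the non-linear activation function is applied to that sum.
    return relu(W @ x + b)

# Illustrative sizes: a layer of 4 neurons fed by 3 inputs.
rng = np.random.default_rng(seed=0)
x = rng.normal(size=3)        # outputs of the previous layer
W = rng.normal(size=(4, 3))   # one weight per connection
b = rng.normal(size=4)        # one bias per neuron

y = dense_layer(x, W, b)      # these outputs become inputs to the next layer
print(y)
```

Grouping the neurons into a layer turns the whole computation into a single matrix-vector product followed by an element-wise non-linearity, which is the linear-algebra structure exploited below.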

Although they are not the whole story, these linear-algebra calculations are the most computationally demanding part of deep learning, particularly as networks grow in size. This is true both for training (the process of determining what weights to apply to each neuron's inputs) and for inference (when the neural network produces the desired results).

What are these mysterious linear-algebra calculations? They really are not that complicated. They involve operations on matrices, which are just rectangular arrays of numbers; spreadsheets, if you like, minus the descriptive column headers you might find in a typical Excel file.

This is good news, because modern computer hardware is very well optimized for matrix operations, which were the bread and butter of high-performance computing long before deep learning became popular. The matrix calculations relevant to deep learning boil down to a large number of multiply-and-accumulate operations, that is, multiplying pairs of numbers and adding up their products.
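To show what "multiply-and-accumulate" means in practice, here is a small plain-Python sketch that multiplies two matrices the slow, explicit way and counts every such operation; the example matrices are arbitrary.

```python
def matmul_with_mac_count(A, B):
    # Multiply matrices A and B entry by entry, counting multiply-accumulate ops.
    n, k = len(A), len(A[0])
    k2, m = len(B), len(B[0])
    assert k == k2, "inner dimensions must match"
    macs = 0
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):              # row of A
        for j in range(m):          # column of B
            acc = 0.0
            for t in range(k):      # pair up numbers, multiply, accumulate
                acc += A[i][t] * B[t][j]
                macs += 1
            C[i][j] = acc
    return C, macs

C, macs = matmul_with_mac_count([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(C)     # [[19.0, 22.0], [43.0, 50.0]]
print(macs)  # 8 multiply-and-accumulate operations for two 2x2 matrices
```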

Two beams whose electric fields are proportional to the numbers to be multiplied, x and y, strike a beam splitter (blue square). The beams leaving the beam splitter shine on photodetectors (ovals), which provide electrical signals proportional to the square of those electric fields. Inverting one photodetector's signal and adding it to the other's yields a signal proportional to the product of the two inputs. David Schneider

Over the years, deep learning has required an ever-growing number of these multiply-and-accumulate operations. Consider LeNet, a pioneering deep neural network designed to classify images. In 1998 it was shown to outperform other machine-learning techniques at recognizing handwritten letters and digits. But by 2012, AlexNet, a neural network that required about 1,600 times as many multiply-and-accumulate operations as LeNet, could recognize thousands of different types of objects in images.

Advancing from LeNet's initial success to AlexNet required nearly 11 doublings of computing performance (2¹¹ is roughly 2,000, in line with that 1,600-fold growth in operations). During the 14 years that took, Moore's Law provided much of the increase. The challenge now is to keep this trend going as Moore's Law loses momentum. The usual answer is simply to throw more computing resources, along with time, money, and energy, at the problem.

As a result, training today's large neural networks often has a significant environmental footprint. One 2019 study found, for example, that training a certain deep neural network for natural-language processing produced five times the carbon-dioxide emissions of a typical car over its entire lifetime of driving.

To be sure, improvements in digital electronic computers are what have allowed deep learning to flourish. But that does not mean the only way to carry out neural-network calculations is with such machines. Decades ago, when digital computers were still relatively primitive, some engineers tackled difficult calculations with analog computers instead. As digital electronics improved, those analog computers fell by the wayside. But now may be the time to pursue that strategy again, particularly when the analog computations can be done optically.

It has long been known that optical fibers can support much higher data rates than electrical wires. That is why all long-distance communication lines went optical starting in the late 1970s. Since then, optical data links have replaced copper wires over shorter and shorter spans, all the way down to rack-to-rack communication in data centers. Optical data communication is faster and uses less power. Optical computing promises the same advantages.

But there is a big difference between communicating data and computing with it, and this is where analog optical approaches hit a roadblock. Conventional computers are based on transistors, which are highly non-linear circuit elements, meaning that their outputs are not simply proportional to their inputs, at least when they are used for computing. Non-linearity is what lets transistors switch on and off, allowing them to be fashioned into logic gates. Such switching is easy to accomplish with electronics, where non-linearities abound. But photons obey Maxwell's equations, which are annoyingly linear, meaning that the output of an optical device is typically proportional to its inputs.

The trick is to use the linearity of optical equipment to do one thing that deep learning relies on most: linear algebra.

To illustrate how that can be done, I will describe here a photonic device that, when coupled to some simple analog electronics, can multiply two matrices together. Such multiplication combines the rows of one matrix with the columns of the other. More precisely, it multiplies pairs of numbers from those rows and columns and adds their products together: the multiply-and-accumulate operations I described earlier. My MIT colleagues and I published a paper on how this could be done in 2019. We are now working to build such an optical matrix multiplier.

The basic computing unit in this device is an optical element called a beam splitter. Although its makeup is in fact more complicated, you can think of it as a half-silvered mirror set at a 45-degree angle. If you send a beam of light into it from the side, the beam splitter allows half of the light to pass straight through, while the other half is reflected from the angled mirror, bouncing off at 90 degrees from the incoming beam.

Now shine a second beam of light, perpendicular to the first, into the beam splitter so that it strikes the other side of the angled mirror. Half of this second beam will likewise be transmitted and half reflected at 90 degrees. The two output beams combine with the two outputs of the first beam. So this beam splitter has two inputs and two outputs.

To use this device for matrix multiplication, you generate two light beams whose electric-field strengths are proportional to the two numbers you want to multiply. Call these field strengths x and y. Shine those two beams into the beam splitter, which combines them. This particular beam splitter does so in a way that produces two outputs whose electric fields have values of (x + y)/√2 and (x − y)/√2.

In addition to the beam splitter, this analog multiplier requires two simple electronic components, photodetectors, to measure the two output beams. They do not measure the electric-field strength of those beams, though. They measure beam power, which is proportional to the square of the electric-field strength.

Why is that relation important? To understand it requires some algebra, but nothing beyond what you learned in high school. Recall that when you square (x + y)/√2 you get (x² + 2xy + y²)/2. And when you square (x − y)/√2, you get (x² − 2xy + y²)/2. Subtracting the latter from the former gives 2xy.
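Written out as a single identity, the algebra the two detectors exploit is just a difference of squares:

```latex
\frac{(x+y)^2}{2} - \frac{(x-y)^2}{2}
  = \frac{(x^2 + 2xy + y^2) - (x^2 - 2xy + y^2)}{2}
  = 2xy
```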

Now pause to consider the significance of this simple bit of math. It means that if you encode one number as a beam of light of a certain intensity and another number as a beam of another intensity, send them through such a beam splitter, measure the two outputs with photodetectors, and negate one of the resulting electrical signals before summing them, you end up with a signal proportional to the product of the two numbers.
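As a sanity check of that scheme, here is a tiny numerical sketch written as plain Python rather than optics. Treating the fields as ideal, noise-free real numbers and the detectors as perfect squarers are my own simplifications, not the authors' implementation.

```python
import math

def optical_multiply(x, y):
    # Encode the two numbers as electric-field amplitudes of two input beams.
    # A 50:50 beam splitter produces fields (x + y)/sqrt(2) and (x - y)/sqrt(2).
    field_plus = (x + y) / math.sqrt(2)
    field_minus = (x - y) / math.sqrt(2)

    # Each photodetector measures power, the square of the electric field.
    power_plus = field_plus ** 2
    power_minus = field_minus ** 2

    # Negate one detector signal and add it to the other: the result is 2xy.
    return power_plus - power_minus

print(optical_multiply(3.0, 4.0))  # 24.0, i.e. twice the product 3 * 4
```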

A simulation of the integrated Mach-Zehnder interferometer found in Lightmatter's neural-network accelerator, shown under three different conditions in which light propagating in the interferometer's two branches undergoes different relative phase shifts (0 degrees in a, 45 degrees in b, and 90 degrees in c). Lightmatter

My description may make it sound as though each of these light beams must be held steady. In fact, you can briefly pulse the light in the two input beams and measure the output pulses. Better yet, you can feed the output signal into a capacitor, which accumulates charge for as long as the pulse lasts. You can then pulse the inputs again for the same duration, this time encoding two new numbers to be multiplied. Their product adds more charge to the capacitor. You can repeat this process as many times as needed, each time carrying out another multiply-and-accumulate operation.

Using pulsed light in this way lets you perform many such operations in rapid sequence. The most energy-intensive part is reading the voltage on the capacitor, which requires an analog-to-digital converter. But you don't have to do that after every pulse; you can wait until the end of a sequence of, say, N pulses. That means the device can perform N multiply-and-accumulate operations using the same amount of readout energy whether N is small or large. Here, N corresponds to the number of neurons per layer in the neural network, which can easily number in the thousands. So this strategy uses very little energy.
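Here is a sketch of that pulsed accumulation, again as an idealized software model rather than real hardware: each loop iteration stands in for one pair of input pulses, the running sum stands in for the charge on the capacitor, and the single returned value stands in for the one analog-to-digital readout at the end of N pulses.

```python
import math

def pulsed_mac(inputs, weights):
    # Each pulse pair encodes one input and one weight as field amplitudes.
    # The capacitor integrates the photodetector difference over all pulses,
    # so only one analog-to-digital readout is needed for N multiply-accumulates.
    charge = 0.0
    for x, w in zip(inputs, weights):
        p_plus = ((x + w) / math.sqrt(2)) ** 2
        p_minus = ((x - w) / math.sqrt(2)) ** 2
        charge += p_plus - p_minus      # adds 2*x*w worth of "charge"
    return charge / 2.0                 # proportional to the dot product

inputs = [0.5, -1.0, 2.0, 0.25]
weights = [1.0, 0.5, 0.1, 4.0]
print(pulsed_mac(inputs, weights))      # 0.5 - 0.5 + 0.2 + 1.0 = 1.2
```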

Sometimes you can save energy on the input side as well, because the same value is often used as an input to multiple neurons. Rather than converting that number into light multiple times, consuming energy each time, it can be converted just once and the resulting beam split into many channels. In this way, the energy cost of input conversion is shared across many operations.

Splitting a beam of light into multiple channels requires nothing more complicated than a lens, but putting a lens onto a chip can be tricky. So the device we are developing to perform neural-network calculations optically may well end up as a hybrid that combines a highly integrated photonic chip with separate optical components.

I have outlined here the strategy my colleagues and I have been pursuing, but there are other ways to skin an optical cat. Another promising approach is based on something called a Mach-Zehnder interferometer, which combines two beam splitters and two fully reflecting mirrors; it, too, can be used to carry out matrix multiplication optically. Two MIT-based startups, Lightmatter and Lightelligence, are developing optical neural-network accelerators based on this method. Lightmatter has built a prototype that uses an optical chip it fabricated, and the company expects to begin selling optical accelerator boards using that chip later this year.

Another startup using optics for computing is Optalysys, which hopes to revive a rather old concept. One of the first uses of optical computing, back in the 1960s, was for processing synthetic-aperture radar data. A key part of the challenge was to apply a mathematical operation called the Fourier transform to the measured data, something the digital computers of the era struggled to do. Even now, applying the Fourier transform to large amounts of data can be computationally intensive. But a Fourier transform can be carried out optically with nothing more complicated than a lens, which is how engineers processed synthetic-aperture data for many years. Optalysys hopes to bring this approach up to date and apply it more widely.
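Digitally, the operation a single lens performs in one pass of light corresponds to a two-dimensional Fourier transform. Here is a minimal NumPy sketch of that digital counterpart; the toy aperture and array size are invented purely for illustration, and real synthetic-aperture processing works on measured complex-valued radar samples.

```python
import numpy as np

# A toy "aperture": a rectangle of ones in an otherwise empty field.
data = np.zeros((256, 256), dtype=complex)
data[96:160, 120:136] = 1.0

# The digital equivalent of what a lens does optically in a single pass:
spectrum = np.fft.fftshift(np.fft.fft2(data))

# What a detector placed in the focal plane would record is the intensity.
intensity = np.abs(spectrum) ** 2
print(intensity.shape, intensity.max())
```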

There is also a company called Luminous, spun out of Princeton University, which is working to create spiking neural networks based on laser neurons. Spiking neural networks more closely mimic the way biological neural networks work and, like our own brains, can compute using very little energy. Luminous's hardware is still in the early stages of development, but the promise of combining two energy-saving approaches, spiking and optics, is quite exciting.

Of course, many technical challenges remain. One is to improve the accuracy and dynamic range of analog optical calculations, which fall far short of what digital electronics can achieve. That is because these optical processors suffer from various sources of noise, and because the digital-to-analog and analog-to-digital converters used to get data in and out have limited accuracy. Indeed, it is difficult to imagine an optical neural network operating with more than 8 to 10 bits of precision. Although 8-bit electronic deep-learning hardware exists (the Google TPU is a good example), the industry demands higher precision, especially for neural-network training.
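To get a feel for what a limit of roughly 8 to 10 bits of precision means, here is a small illustrative sketch that quantizes a set of values to a given number of bits and reports the worst-case round-trip error. It is a deliberate simplification; real accelerators use more careful scaling and calibration than this.

```python
import numpy as np

def quantize(x, bits):
    # Snap each value to the nearest of 2**bits evenly spaced levels
    # spanning the range of the data.
    levels = 2 ** bits - 1
    lo, hi = x.min(), x.max()
    step = (hi - lo) / levels
    return np.round((x - lo) / step) * step + lo

rng = np.random.default_rng(seed=0)
weights = rng.normal(size=10_000)

for bits in (8, 10, 16):
    err = np.abs(quantize(weights, bits) - weights).max()
    print(f"{bits}-bit quantization, worst-case error: {err:.2e}")
```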

Integrating optical components onto a chip is also difficult. Because these components are tens of micrometers in size, they cannot be packed nearly as tightly as transistors, so the required chip area grows quickly. A 2017 demonstration of this approach by MIT researchers involved a chip measuring 1.5 millimeters on a side. Even the largest chips are no bigger than a few square centimeters, which limits the size of the matrices that can be processed in parallel this way.

There are also many questions on the computer-architecture side that photonics researchers tend to gloss over. What is clear, though, is that, at least in theory, photonics has the potential to accelerate deep learning by several orders of magnitude.

Based on the technology currently available for the various components (optical modulators, detectors, amplifiers, analog-to-digital converters), it is reasonable to expect the energy efficiency of neural-network calculations to be 1,000 times better than that of today's electronic processors. With more aggressive assumptions about emerging optical technology, that factor could be as large as a million. And because electronic processors are power-limited, these improvements in energy efficiency are likely to translate into corresponding improvements in speed.

Many of the concepts in analog optical computing are decades old; some even predate silicon computers. Schemes for optical matrix multiplication, and even optical neural networks, were first demonstrated in the 1970s. But the approach did not catch on. Will it be different this time? Possibly, for three reasons.

First, deep learning is now genuinely useful, not just an academic curiosity. Second, we can no longer rely on Moore's Law alone to keep improving electronics. And finally, we have a new technology that was not available to earlier generations: integrated photonics. These factors suggest that optical neural networks will arrive for real this time, and the future of such computing may indeed be photonic.