As Moore's Law slows or even breaks down, and given the limited response rate of electronic devices, artificial neural networks built on the traditional von Neumann architecture can no longer meet the demands of high-capacity, low-latency data processing. Optical neural networks implement neuromorphic computing on an optical hardware platform. With their low power consumption and parallel operation, they can greatly improve the data-processing efficiency of artificial neural networks. However, realizing the nonlinear activation function optically remains a major challenge.
Current optical nonlinear activation functions (OAFs) fall into electro-optical schemes and all-optical schemes. The response rate of electro-optical schemes is inevitably limited by the photoelectric conversion devices, and they consume additional energy because they need an extra power supply. For all-optical schemes, the weak nonlinearity of silicon means that high power and long interaction lengths are often required to achieve on-chip nonlinearity. Current all-optical schemes on the silicon platform are based on microring resonators, which gives the device a small operating bandwidth and limits the information-processing capacity of optical neural networks. A silicon-based all-optical OAF with large bandwidth, low power consumption, and high speed is therefore needed.
Recently, Professor Dong Jianji of the Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, proposed and experimentally demonstrated a silicon-based all-optical nonlinear activation function based on a Ge-Si hybrid asymmetric coupler (Figure 1).
In the communication band, Ge is often used as the absorber in photodetectors on silicon platforms. It has also been reported that the plasma dispersion effect of germanium films at 1550 nm is stronger than that of silicon, owing to the biaxial compressive strain caused by the lattice mismatch between germanium and silicon. The device exploits both the intrinsic absorption and the plasma dispersion effect of germanium at 1550 nm: as the input power increases, the refractive index of germanium changes, which changes the coupling coefficient between the waveguides and, in turn, the transmittance of the device, thereby realizing nonlinear activation.
Under the plasma dispersion effect of germanium, the transmission spectrum of the coupler changes significantly. The experimentally measured activation function at 1550 nm exhibits a saturation effect, with a loss of 4.28 dB, a threshold power of 5.1 mW, and a response rate of 70 MHz. Because the device is based on a directional coupler rather than a resonator, it has a large operating bandwidth (Figures 2 and 3).
Applied to the MNIST handwritten digit classification task, the OAF reaches a highest classification accuracy of 99.3%, which shows that it can achieve accuracy similar to that of the traditional tanh or ReLU function (Figure 4). The device is simple in structure, easy to fabricate and integrate, and is expected to be used in future optical neural networks.
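The chain described above (input power changes the index, which changes the coupling and therefore the transmittance) can be illustrated with a minimal Python sketch. It uses the textbook coupled-mode formula for a two-waveguide directional coupler and converts the reported 4.28 dB loss to a linear transmittance. The function `toy_transmittance` and all of its parameters (`kappa`, `length`, `delta0`, `p_scale_mw`, and the exponential dependence of the mismatch on power) are hypothetical choices made only for illustration. They are not taken from the paper and do not reproduce the measured response of the device.

```python
import math


def db_to_linear(loss_db: float) -> float:
    """Convert a loss in dB to a linear power transmittance."""
    return 10 ** (-loss_db / 10)


def cross_transmittance(kappa: float, delta: float, length: float) -> float:
    """Cross-port power fraction of a two-waveguide directional coupler.

    Textbook coupled-mode result: kappa is the coupling coefficient,
    delta is half the propagation-constant mismatch between the two
    waveguides, and length is the coupling length (consistent units).
    """
    s = math.hypot(kappa, delta)
    return (kappa / s) ** 2 * math.sin(s * length) ** 2


def toy_transmittance(
    p_in_mw: float,
    kappa: float = 1.0,                # hypothetical, normalized
    length: float = math.pi / 2,       # full transfer when delta == 0
    delta0: float = math.sqrt(3.0),    # hypothetical low-power mismatch
    p_scale_mw: float = 5.0,           # hypothetical power scale
) -> float:
    """Hypothetical power-dependent transmittance (assumption, not the paper's model).

    The mismatch is assumed to shrink exponentially with input power,
    standing in for the power-induced index change in germanium.
    """
    delta = delta0 * math.exp(-p_in_mw / p_scale_mw)
    return cross_transmittance(kappa, delta, length)


if __name__ == "__main__":
    # A 4.28 dB loss corresponds to a linear transmittance of about 0.37.
    print(f"4.28 dB loss -> transmittance {db_to_linear(4.28):.3f}")
    for p in (0.0, 1.0, 5.0, 20.0):
        t = toy_transmittance(p)
        print(f"P_in = {p:5.1f} mW  T = {t:.3f}  P_out = {p * t:.3f} mW")
```

In this toy model the transmittance rises with input power because the assumed mismatch shrinks. In a real device the direction and shape of the response depend on the coupler design and on the material response.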
Figure 1. (a) Three-dimensional schematic diagram of a germanium-silicon hybrid asymmetric directional coupler. (b) SEM image of the device
Figure 2. (a) Measured transmission spectra at different input powers. (b) Net gain. (c) Measured OAF
Figure 3. Response rate test results
Figure 4. Test results for the MNIST handwritten digit classification task
The results were recently published in the IEEE Journal of Selected Topics in Quantum Electronics [1].
Full text can be viewed at: https://ieeexplore.ieee.org/document/9756290
[1] H. Li, B. Wu, W. Tong, J. Dong and X. Zhang, "All-Optical Nonlinear Activation Function Based on Germanium Silicon Hybrid Asymmetric Coupler," IEEE Journal of Selected Topics in Quantum Electronics, doi: 10.1109/JSTQE.2022.3166510.