Vision Transformer network for optical overlay metrology on semiconductor wafers

Publication date
DOI http://dx.doi.org/10.1063/5.0301749
Reference L. de Wolf, M. Lipp, M. Cochez, A.J. den Boef and L.V. Amitonova, Vision Transformer network for optical overlay metrology on semiconductor wafers, APL Mach. Learn. 4 (1), 016101: 1-10 (2026)
Group Nanoscale Imaging and Metrology

Fast, high-precision wafer metrology is critical for the semiconductor industry. In this work, we explore the use of simple and cost-effective optical sensors in combination with data-driven algorithms. We propose and compare three data-driven approaches of varying complexity that directly infer sub-nanometer metrology parameters from low-numerical-aperture coherent optical microscope images, with a focus on precision, noise robustness, and data efficiency. In particular, we apply Vision Transformers (ViTs), Convolutional Neural Networks, and Multilayer Perceptrons to simulated datasets with varying aberrations. We report sub-nanometer measurement accuracy and precision for all models, even in the presence of strong optical aberrations and noise. Furthermore, we find that ViTs consistently achieve low errors and outperform the other models in limited-data regimes.
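
For readers unfamiliar with Vision Transformers, the generic input step they share can be sketched as follows: the image is cut into non-overlapping square patches, and each patch is flattened into one token for the transformer to process. This is only an illustration of that standard step, not the architecture used in the paper, whose details are not given in this abstract. The function name `image_to_patch_tokens`, the 4x4 demo image, and the patch size of 2 are all hypothetical.

```python
from typing import List

Image = List[List[float]]


def image_to_patch_tokens(image: Image, patch: int) -> List[List[float]]:
    """Split a 2D image into non-overlapping patch x patch tiles.

    Each tile is flattened row by row into one token. Hypothetical helper
    for illustration; a real ViT would follow this with a learned linear
    projection and position embeddings.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            token = [
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ]
            tokens.append(token)
    return tokens


# Hypothetical 4x4 "image" whose pixel values are 0..15 in row-major order.
demo = [[float(4 * r + c) for c in range(4)] for r in range(4)]
tokens = image_to_patch_tokens(demo, patch=2)
print(len(tokens), len(tokens[0]))
```

With a patch size of 2, the 4x4 demo image yields four tokens of four values each. In a regression setting such as the one described above, the transformer output would then be mapped to a continuous value rather than a class label.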