Decoding the Encoded – Linguistic Secrets of Language Models: A Systematic Literature Review

Authors

H. Avetisyan and D. Broneske, The German Centre for Higher Education Research and Science Studies (DZHW), Germany

Abstract

The growing role of language models in natural language processing necessitates a deeper understanding of their linguistic knowledge. Linguistic probing tasks, designed to evaluate models' understanding of various linguistic phenomena, have become crucial for model explainability. Objective: This systematic review critically assesses the linguistic knowledge of language models through linguistic probing, providing a comprehensive overview of the linguistic phenomena they capture and identifying areas for future research. Method: We performed an extensive search of relevant academic databases and analyzed 57 articles published between October 2018 and October 2022. Results: While language models exhibit extensive linguistic knowledge, limitations persist in their comprehension of specific phenomena. The review also highlights the need for consensus both on how the linguistic knowledge of language models is evaluated and on the linguistic terminology used. Conclusion: Our review offers an extensive look into the linguistic knowledge of language models as revealed by linguistic probing tasks. It underscores the importance of understanding these models' linguistic capabilities for their effective use in NLP applications and for fostering more explainable AI systems.

Keywords

LLMs, linguistic knowledge, probing, analysis of LMs.

Volume 13, Number 16