Abstract
This review examines recent advances in linguistic profiling for age detection on social media. As digital communication becomes central to identity expression, researchers have developed models that infer users' age from lexical, syntactic, and stylistic features. Studies across platforms such as Twitter, Facebook, and Telegram show that linguistic variation often correlates with age. Although machine learning approaches have shown promising accuracy, several limitations persist: overreliance on English-language corpora, insufficient representation of low-resource languages such as Uzbek, and limited integration of sociolinguistic theory. Ethical concerns, such as privacy and consent, are also under-addressed. This article categorizes existing methodologies, compares cross-cultural findings, and identifies contradictions in empirical results. It highlights the need for more inclusive, longitudinal, and ethically grounded approaches to age profiling. By outlining current gaps and future directions, this review contributes to the development of fair, transparent, and linguistically informed systems for age detection in digital contexts.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2025 International Journal of World Languages