Jakhongir Saydaliev

EPFL | Logitech


I research large language and vision models as an MSc student at EPFL. I have been fortunate to work with the NLP, DHLAB, and LINX labs, as well as with SwissAI and Logitech. I completed my Bachelor's degree at Politecnico di Torino.

Research Interests

I work on building inclusive, multimodal AI systems that reason well and work for everyone. Below are some areas I have been working on:

  • Multilingual NLP: I want to bridge the gaps in multilingual NLP and ensure AI benefits linguistically diverse and underrepresented communities
    • ConLID: Contrastive language identification for low-resource languages
    • Apertus: The first large-scale language model developed in Switzerland
  • Multimodal Reasoning: Models need to reason across modalities, not just text, to handle real-world scenarios
  • Efficient Reasoning: As we scale to multimodal scenarios, we need computationally efficient reasoning to make deployment practical
    • Investigating the “overthinking” phenomenon in LLMs (ongoing)

I am seeking a PhD position starting in Fall 2026; a brief overview of my proposed work is available in this research proposal video.

News

Jan 2026 Our ConLID paper was accepted to EACL 2026
Oct 2025 Our paper on LLM agents for historical cadastre data was published at Computational Humanities Research (CHR)
Sep 2025 Joined Logitech as an ML Research Intern to work on Computer Use Agents

Selected Publications

  1. ConLID: Supervised Contrastive Learning for Low-Resource Language Identification
    Negar Foroutan*, Jakhongir Saydaliev*, Ye Eun Kim, and 1 more author
    European Chapter of the Association for Computational Linguistics (EACL), 2026
  2. LLM Agents for Interactive Exploration of Historical Cadastre Data: Framework and Application to Venice
    Tristan Karch*, Jakhongir Saydaliev*, Isabella Di Lenardo, and 1 more author
    Computational Humanities Research (CHR), 2025
  3. Apertus: Democratizing Open and Compliant LLMs for Global Language Environments
    Apertus Team
    Submitted to Association for Computational Linguistics (ACL), 2026
    Contributed to the pre-training data through my ConLID project

Other Contributions

  1. INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
    Angelika Romanou, Negar Foroutan, Anna Sotnikova, and 54 more authors
    International Conference on Learning Representations (ICLR), 2025
    Contributed to collecting the Uzbek dataset