Jakhongir Saydaliev

EPFL | Logitech


I research large language and vision models as an MSc student at EPFL. I’m fortunate to have worked at the NLP, DHLAB, and LINX labs, as well as at SwissAI and Logitech. I completed my Bachelor’s at Politecnico di Torino.

Research Interests

My research focuses on building inclusive, multimodal AI systems that reason well and work for everyone. Below are some areas I’ve been working on:

  • Multilingual NLP: I want to bridge the gaps in multilingual NLP and ensure AI benefits linguistically diverse and underrepresented communities
    • ConLID: Contrastive language identification for low-resource languages
    • Apertus: The first large-scale language model developed in Switzerland
  • Multimodal Reasoning: Models need to reason across modalities, not just text, to handle real-world scenarios
  • Efficient Reasoning: As we scale to multimodal scenarios, we need computationally efficient reasoning to make deployment practical
    • Investigating the “overthinking” phenomenon in LLMs (ongoing)

I am seeking a PhD position starting in Fall 2026; a brief overview of my proposed work is available in this research proposal video.

News

Sep 2025 Joined Logitech as an ML Research Intern to work on Computer Use Agents
Jun 2025 Joined SwissAI to work on reasoning for vision language models through reinforcement learning
May 2025 Won 2nd place in a hackathon on efficient LLM training [code]

Selected Publications

  1. conlid_figure.png
    ConLID: Supervised Contrastive Learning for Low-Resource Language Identification
    Negar Foroutan*, Jakhongir Saydaliev*, Ye Eun Kim, and 1 more author
    2025
    Under review at EACL 2026; Highest score on the WMDQS Shared Task #2 at COLM 2025
  2. venice_figure.png
    LLM Agents for Interactive Exploration of Historical Cadastre Data: Framework and Application to Venice
    Tristan Karch*, Jakhongir Saydaliev*, Isabella Di Lenardo, and 1 more author
    Computational Humanities Research, 2025
  3. apertus.png
    Apertus: Democratizing Open and Compliant LLMs for Global Language Environments
    Apertus Team
    2025
    Contributed to the pre-training data through my ConLID project

Other contributions

  1. include.png
    INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
    Angelika Romanou, Negar Foroutan, Anna Sotnikova, and 54 more authors
    2025
    Contributed to collecting the Uzbek dataset