yitianlian/README.md

Chengxing Xie

Building LLM agents that can plan, reason, and improve through reinforcement learning.

Ph.D. student at Tsinghua University · Researcher / Builder in Agentic RL and LLM post-training


What I Do

I work on language model agents that need to operate beyond static benchmarks, with a focus on planning, reasoning, and learning under real-world system constraints.

My current interests include:

  • Agentic reinforcement learning
  • LLM post-training systems
  • Long-horizon reasoning and planning
  • Efficient infrastructure for open models

Current Roles

  • Ph.D. student in Computer Science at Tsinghua University
  • Advised by Prof. Hongning Wang
  • Intern at Zhipu AI

Previously worked with Shanghai AI Lab, HKU, and KAUST.

Open-Source Work

  • Core contributor to GLM-5
  • Core contributor to GLM-4.5
  • Core contributor to slime, an LLM post-training framework for scalable RL
  • Contributor to SGLang, mainly on RL-related capabilities

Selected Work

  • GLM-5: from vibe coding to agentic engineering
  • GLM-4.5: agentic, reasoning, and coding foundation models
  • SWE-Fixer
    • ACL 2025 Findings
  • Can Large Language Model Agents Simulate Human Trust Behavior?
    • NeurIPS 2024

A Short Version

I build practical LLM agent systems:
better reasoning, better training, better deployment.

Links

Pinned

  1. THUDM/slime (Public)

    slime is an LLM post-training framework for RL Scaling.

    Python · 4.8k stars · 633 forks

  2. slime (Public)

    Forked from THUDM/slime

    slime is an LLM post-training framework aimed at scaling RL.

    Python