Hello! I'm Zhenting Qi

I am a first-year Master's student in Computational Science and Engineering at Harvard University. Prior to Harvard, I received dual bachelor's degrees in Computer Engineering from UIUC and ZJU with highest honors.

Previously, I worked as a research intern at Yale (2022-2023) with Professor Dragomir R. Radev, and at UIUC (2022) with Professor Volodymyr Kindratenko.

I am open to research or internship opportunities! Please contact me at zhentingqi[AT]g[DOT]harvard[DOT]edu.


Research

My research interests lie in Natural Language Processing (NLP). My long-term research goal is to build intelligent and reliable AI systems for the benefit of human society. Motivated by this goal, I am currently interested in the following topics (in no particular order):

1) Reasoning: Why do current reasoning paradigms work? How can we design AIs to perform human-like reasoning?

2) Safety: How do we enhance the controllability, reliability, and robustness of AI? How can we design scalable methods to ensure AI safety while improving AI capabilities?

For more information, please see my Google Scholar, Semantic Scholar, and DBLP profiles.

News


Publications (Selected)

Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems

ICLR 2024 Workshop on Navigating and Addressing Data Problems for Foundation Models (DPFM)

We show that an adversary can exploit LMs' instruction-following capabilities to easily extract text data verbatim from the datastore of RAG systems built with instruction-tuned LMs via prompt injection.

PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching

EMNLP 2023, Industry Track Oral Presentation

We improve LoRA-finetuned LLMs with a prompt matching framework, achieving performance on par with full SFT.

QTSumm: A New Benchmark for Query-Focused Table Summarization

EMNLP 2023

We introduce a new benchmark named QTSUMM for query-focused table summarization, which contains 7,111 human-annotated query-summary pairs over 2,934 tables covering diverse topics.

SaFER: A Robust and Efficient Framework for Finetuning BERT-based Classifier with Noisy Labels

ACL 2023, Industry Track

We propose a robust and efficient fine-tuning framework for BERT-based text classifiers that combats label noise without access to any clean data for training or validation.

LoFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control

EACL 2023, Short Paper Oral Presentation

LoFT utilizes logic forms as fact verifiers and content planners to control logical table-to-text generation.

FOLIO: Natural Language Reasoning with First-Order Logic

Preprint 2022

We present FOLIO, a human-annotated, open-domain, logically complex and diverse dataset for reasoning in natural language (NL), equipped with first-order logic (FOL) annotations.


Industry

Research Assistant in LLMs, Microsoft Research Asia (Beijing)

Oct 2023 - Present

Researched LLM reasoning.

Research Intern in AI Algorithms, INF Technology (Shanghai)

Feb 2022 - Jun 2023

Researched LLMs in industrial applications.


Miscellaneous

I enjoy playing basketball, reading, cooking, and watching movies, and I am a huge fan of guitar and music!