About Me
I am a Principal Scientist on the Amazon Generative Foundation Modeling Team. Before joining Amazon, I obtained my Ph.D. in Machine Learning from the School of Industrial and Systems Engineering (ISyE) at Georgia Tech, where I spent wonderful years with Prof. Tuo Zhao in the FLASH (Foundations of LeArning Systems for alcHemy) research group. I received my B.S. degree in Computer Science and Mathematics from the School of the Gifted Young at the University of Science and Technology of China (USTC).
My research focuses on generative AI, large language modeling, deep learning and open-source software for data analysis.
Hiring Interns and Full-Time
We have research intern and full-time openings in Generative AI. If you are interested, shoot me an email.
Professional Experience
-
Principal Applied Scientist, Amazon, 2021-present
-
Research Intern, Amazon, 2020 Fall
-
Research Intern, Google AI, 2020 Summer
-
Research Intern, Microsoft, 2019 Summer
Education
-
Ph.D. in Machine Learning, Georgia Institute of Technology, ISyE, 2017-2021
-
B.S. in Mathematics and Computer Science, University of Science and Technology of China, ScGY, 2013-2017
Research
Preprints and Working Papers
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond 2023
Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu [arXiv] [Code]
Evolutionary Tree of LLMs (credits to Hongye Jin) [Github]
Publications
-
Knowledge-selective pretraining for attribute value extraction 2023
Hui Liu, Qingyu Yin, Zhengyang Wang, Chenwei Zhang, Haoming Jiang, Yifan Gao, Zheng Li, Xian Li, Chao Zhang, Bing Yin, William Wang, Xiaodan Zhu
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
-
SST: Semantic and structural transformers for hierarchy-aware language models in e-commerce 2023
Karan Samel, Houyu Zhang, Jun Ma, Haoming Jiang, Qing Ping, Sheng Wang, Yi Xu, Belinda Zeng, Trishul Chilimbi
IEEE BigData, 2023
-
Graph Reasoning for Question Answering with Triplet Retrieval 2023
Shiyang Li, Yifan Gao, Haoming Jiang, Qingyu Yin, Zheng Li, Xifeng Yan, Chao Zhang and Bing Yin
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
-
Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites 2023
Simiao Zuo, Qingyu Yin, Haoming Jiang, Shaohui Xi, Bing Yin, Chao Zhang and Tuo Zhao
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
-
Amazon-m2: A multilingual multi-locale shopping session dataset for recommendation and text generation 2023
Wei Jin, Haitao Mao, Zheng Li, Haoming Jiang, Chen Luo, Hongzhi Wen, Haoyu Han, Hanqing Lu, Zhengyang Wang, Ruirui Li, Zhen Li, Monica Xiao Cheng, Rahul Goutam, Haiyang Zhang, Karthik Subbian, Suhang Wang, Yizhou Sun, Jiliang Tang, Bing Yin, Xianfeng Tang
KDD Cup, 2023
-
LightToken: a Task and Model-agnostic Lightweight Token Embedding Framework for Pre-trained Language Models 2023
Haoyu Wang, Ruirui Li, Haoming Jiang, Zhengyang Wang, Xianfeng Tang, Bin Bi, Monica Cheng, Bing Yin, Yaqing Wang, Tuo Zhao, Jing Gao
SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023
-
SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process 2023
Zichong Li, Yanbo Xu, Simiao Zuo, Haoming Jiang, Chao Zhang, Tuo Zhao and Hongyuan Zha
International Conference on Machine Learning (ICML), 2023
-
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers 2023
Chen Liang, Haoming Jiang, Zheng Li, Xianfeng Tang, Bing Yin, and Tuo Zhao [arXiv]
International Conference on Learning Representations (ICLR), 2023
-
AutoGDA: Automated Graph Data Augmentation for Node Classification 2022
Tong Zhao, Xianfeng Tang, Danqing Zhang, Haoming Jiang, Nikhil Rao, Yiwei Song, Pallav Agrawal, Karthik Subbian, Bing Yin, and Meng Jiang [PDF]
The Learning on Graphs Conference (LoG), 2022
-
Query Attribute Recommendation at Amazon Search 2022
Chen Luo, William Headean, Neela Avudaiappan, Haoming Jiang, Tianyu Cao, Qingyu Yin, Yifan Gao, Zheng Li, Rahul Goutam, Haiyang Zhang, Bing Yin [PDF]
The ACM Conference on Recommender Systems (RecSys), 2022
-
Condensing Graphs via One-Step Gradient Matching 2022
Wei Jin, Xianfeng Tang, Haoming Jiang, Zheng Li, Danqing Zhang, Jiliang Tang, Bing Yin [arXiv]
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), 2022
-
SeqZero: Few-shot Compositional Semantic Parsing with Sequential Prompts and Zero-shot Models 2022
Jingfeng Yang, Haoming Jiang, Qingyu Yin, Danqing Zhang, Bing Yin, Diyi Yang [arXiv]
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022
-
Self-Training with Differentiable Teacher 2022
Simiao Zuo*, Yue Yu*, Chen Liang, Haoming Jiang, Siawpeng Er, Chao Zhang, Tuo Zhao and Hongyuan Zha [arXiv]
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022
-
Multilingual knowledge graph completion with self-supervised adaptive graph alignment 2022
Zijie Huang, Zheng Li, Haoming Jiang, Tianyu Cao, Hanqing Lu, Bing Yin, Karthik Subbian, Yizhou Sun, Wei Wang [arXiv]
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
News: We will present our findings at the Knowledge Graph Meetup, June 23, 2022
-
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models 2022
Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen and Tuo Zhao [arXiv]
International Conference on Learning Representations (ICLR), 2022
-
Nonparametric Regression on Low-Dimensional Manifolds using Deep ReLU Networks 2021+
Minshuo Chen, Haoming Jiang, Wenjing Liao and Tuo Zhao [arXiv]
Information and Inference: A Journal of the IMA, 2021+
-
Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach 2021
Haoming Jiang, Bo Dai, Mengjiao Yang, Tuo Zhao and Wei Wei [arXiv] [Code]
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
News: We are organizing the First Workshop on Evaluations and Assessments of Neural Conversation Systems (EANCS), co-located with EMNLP 2021. Shared Task on Dialogue OPE: Website
-
Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach 2021
Simiao Zuo, Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Jianfeng Gao, Weizhu Chen and Tuo Zhao [arXiv] [Code]
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
-
ARCH: Efficient Adversarial Regularized Training with Caching 2021
Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen and Tuo Zhao [arXiv] [Code]
Conference on Empirical Methods in Natural Language Processing (EMNLP), Findings, 2021
-
Token-wise Curriculum Learning for Neural Machine Translation 2021
Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao and Tuo Zhao [arXiv] [Code]
Conference on Empirical Methods in Natural Language Processing (EMNLP), Findings, 2021
-
Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data 2021
Haoming Jiang, Danqing Zhang, Tianyu Cao, Bing Yin and Tuo Zhao [arXiv] [Code]
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
-
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization 2021
Chen Liang, Simiao Zuo, Minshuo Chen, Haoming Jiang, Xiaodong Liu, Pengcheng He, Tuo Zhao and Weizhu Chen [arXiv] [Code]
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
-
Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach 2021
Yue Yu, Simiao Zuo, Haoming Jiang, Wendi Ren, Tuo Zhao and Chao Zhang [arXiv] [Code]
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021
-
Learning to Defense by Learning to Attack 2021
Haoming Jiang*, Zhehui Chen*, Yuyang Shi, Bo Dai, and Tuo Zhao (* Equal Contribution) [arXiv] [Code]
The 24th International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
-
Calibrated Fine-Tuning for Pre-trained Language Models via Manifold Smoothing 2020
Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, Tuo Zhao and Chao Zhang [arXiv] [Code]
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
-
Deep Reinforcement Learning with Smooth Policy 2020
Qianli Shen, Yan Li, Haoming Jiang, Zhaoran Wang, Tuo Zhao [arXiv]
International Conference on Machine Learning (ICML), 2020
-
Transformer Hawkes Process 2020
Simiao Zuo, Haoming Jiang, Zichong Li, Tuo Zhao, Hongyuan Zha [arXiv] [Code]
International Conference on Machine Learning (ICML), 2020
-
BOND: Bert-Assisted Open-Domain Named Entity Recognition with Distant Supervision 2020
Chen Liang*, Yue Yu*, Haoming Jiang*, Siawpeng Er, Ruijia Wang, Tuo Zhao and Chao Zhang (* Equal Contribution) [arXiv] [Code]
The 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2020
-
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization 2020
Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao and Tuo Zhao [arXiv] [Code]
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
News (2019/12/05): Rank #1 on GLUE
-
Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing 2020
Haoming Jiang, Chen Liang, Chong Wang and Tuo Zhao [arXiv] [Code]
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
-
On the Variance of the Adaptive Learning Rate and Beyond 2020
Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao and Jiawei Han [arXiv] [Code]
International Conference on Learning Representations (ICLR), 2020
-
Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds 2019
Minshuo Chen, Haoming Jiang, Wenjing Liao and Tuo Zhao [arXiv]
Annual Conference on Neural Information Processing Systems (NeurIPS), 2019
-
Meta Learning with Relational Information for Short Sequences 2019
Yujia Xie, Haoming Jiang, Feng Liu, Tuo Zhao and Hongyuan Zha [arXiv] [Code]
Annual Conference on Neural Information Processing Systems (NeurIPS), 2019
-
On Fast Convergence of Proximal Algorithms for SQRT-Lasso Optimization: Don't Worry About Its Nonsmooth Loss Function 2019
Xingguo Li, Haoming Jiang, Jarvis Haupt, Raman Arora, Han Liu, Mingyi Hong and Tuo Zhao [arXiv] [Code]
Conference on Uncertainty in Artificial Intelligence (UAI), 2019
-
On Scalable and Efficient Computation of Large Scale Optimal Transport 2019
Yujia Xie, Minshuo Chen, Haoming Jiang, Tuo Zhao, Hongyuan Zha [arXiv] [Code]
International Conference on Machine Learning (ICML), 2019
-
On Computation and Generalization of Generative Adversarial Networks under Spectrum Control 2019
Haoming Jiang, Zhehui Chen, Minshuo Chen, Feng Liu, Dingding Wang, Tuo Zhao [arXiv] [Code]
International Conference on Learning Representations (ICLR), 2019
-
Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python 2019
Jason Ge*, Xingguo Li*, Haoming Jiang, Han Liu, Tong Zhang, Mengdi Wang and Tuo Zhao (*Equal Contribution) [PDF] [R Package] [Python Package]
Journal of Machine Learning Research (JMLR), 2019
-
Contextual Text Denoising with Masked Language Model 2019
Yifu Sun, Haoming Jiang [arXiv]
Conference on Empirical Methods in Natural Language Processing (EMNLP), Workshop W-NUT, 2019
-
Designing Deployable 3D Scissor Structures with Ball-and-Socket Joints 2018
Xuejin Chen, Haoming Jiang, Tingting Xuan, Lihan Huang, Ligang Liu [PDF]
Computer Animation & Virtual Worlds (CAVW), 2018
-
Scissor-based 3D deployable contour 2017
Haoming Jiang, Xuejin Chen, Tingting Xuan, Lihan Huang, Ligang Liu [PDF]
International Conference on Virtual Reality and Visualization (ICVRV), 2017
Projects
Software
-
Arxiv Viewer: Check out the web app for daily arXiv papers: Arxiv Viewer
-
PICASSO: PathwIse CalibrAted Sparse Shooting algOrithm [R Package] [Python Package]
-
HUGE: High-Dimensional Undirected Graph Estimation [R Package]
-
SAM: Sparse Additive Modelling [R Package]
-
ESMOTE: Efficient Synthetic Minority Over-sampling Technique [R Package]
-
Setup Toolkits: A quick setup toolkit for Vim, tmux, and zsh on a Linux server [zip]
-
FlashPythonToolbox: A few ready-to-use Python tools for machine learning [Github]
Others
MISC
- Check out the web app for daily arXiv papers: Arxiv Viewer
- Innovative SMOTE for High Dimensional and Large Scaled Imbalanced Data [R Package] [Chinese Dissertation]
- Data Driven Approach for Deploying Charging Station for Electric Vehicles [Chinese Dissertation]
- Analyzing Tweets Sentiment via Machine Learning Approach [Report] [Tokenizer] [Extra Unlabeled Data] [Tweets Vectors]
- Deep Reinforcement Learning For Raiden Game [Report] [Demo]
- GeoLab, an open source 3D model processing system [GeoLab]
- Development of body interactive game based on Kinect [Report] [Video]