Scaling instruction-finetuned language models.
{H. W. Chung, L. Hou, S. Longpre}, B. Zoph, Y. Tay, W. Fedus, Y. Li, X. Wang, M. Dehghani, S. Brahma, A. Webson, S. Gu, Z. Dai, M. Suzgun, X. Chen, A. Chowdhery, S. Narang, G. Mishra, A. Yu, V. Zhao, Y. Huang, A. Dai, H. Yu, S. Petrov, E. Chi, J. Dean, J. Devlin, A. Roberts, D. Zhou, Q. Le, and J. Wei.
Transcending scaling laws with 0.1% extra compute.
Y. Tay, J. Wei, H. W. Chung, V. Tran, D. So, S. Shakeri, X. Garcia, H. Zheng, J. Rao, A. Chowdhery, D. Zhou, D. Metzler, S. Petrov, N. Houlsby, Q. Le, and M. Dehghani.
Challenging BIG-Bench tasks and whether chain-of-thought can solve them.
M. Suzgun, N. Scales, N. Schärli, S. Gehrmann, Y. Tay, H. W. Chung, A. Chowdhery, Q. V. Le, E. Chi, D. Zhou, and J. Wei.
Mind's Eye: Grounded language model reasoning through simulation.
R. Liu, J. Wei, S. Gu, T. Wu, S. Vosoughi, C. Cui, D. Zhou, and A. Dai.
Language models are multilingual chain-of-thought reasoners.
{F. Shi, M. Suzgun}, M. Freitag, X. Wang, S. Srivats, S. Vosoughi, H. W. Chung, Y. Tay, S. Ruder, D. Zhou, D. Das, and J. Wei.
UL2: Unifying language learning paradigms.
Y. Tay, M. Dehghani, V. Tran, X. Garcia, J. Wei, X. Wang, H. W. Chung, D. Bahri, T. Schuster, H. Zheng, D. Zhou, N. Houlsby, and D. Metzler.
Least-to-most prompting enables complex reasoning in large language models.
D. Zhou, N. Schärli, L. Hou, J. Wei, N. Scales, X. Wang, D. Schuurmans, O. Bousquet, Q. Le, and E. Chi.
PaLM: Scaling language modeling with Pathways.
{A. Chowdhery, S. Narang, J. Devlin} and 64 additional authors.
Self-consistency improves chain of thought reasoning in language models.
X. Wang, J. Wei, D. Schuurmans, Q. Le, E. Chi, S. Narang, A. Chowdhery, and D. Zhou.
TMLR '22 Emergent abilities of large language models.
J. Wei, Y. Tay, R. Bommasani, C. Raffel, B. Zoph, S. Borgeaud, D. Yogatama, M. Bosma, D. Zhou, D. Metzler, E. Chi, T. Hashimoto, O. Vinyals, P. Liang, J. Dean, and W. Fedus.
Stanford HAI blog
NeurIPS '22 Chain-of-thought prompting elicits reasoning in large language models.
J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, and D. Zhou.
Sundar explains chain of thought prompting at Google I/O 2022 / Google AI blog
ACL '22 A recipe for arbitrary text style transfer with large language models.
{E. Reif, D. Ippolito}, A. Yuan, A. Coenen, C. Callison-Burch, and J. Wei.
ICLR '22 Finetuned language models are zero-shot learners.
{J. Wei, M. Bosma, V. Zhao, K. Guu}, A. Yu, B. Lester, N. Du, A. Dai, and Q. Le.
Google AI blog / oral
ICLR '22 The MultiBERTs: BERT reproductions for robustness analysis.
{T. Sellam, S. Yadlowsky}, I. Tenney, J. Wei, N. Saphra, A. D'Amour, T. Linzen, J. Bastings, I. Turc, J. Eisenstein, D. Das, and E. Pavlick.
EMNLP '21 Frequency effects on syntactic rule learning in transformers.
J. Wei, D. Garrette, T. Linzen, and E. Pavlick.
Google AI blog / oral
EMNLP '21 Good-enough example extrapolation.
J. Wei.
ACL '21 A cognitive regularizer for language modeling.
J. Wei, C. Meister, and R. Cotterell.
ACL '21 Language model augmented relevance score.
R. Liu, J. Wei, and S. Vosoughi.
ACL '21 (Findings) A survey of data augmentation approaches for NLP.
{S. Feng, V. Gangal}, J. Wei, S. Chandar, S. Vosoughi, T. Mitamura, and E. Hovy.
ACL '21 (Findings) Modulating language models with emotions.
R. Liu, J. Wei, C. Jia, and S. Vosoughi.
NAACL '21 Linguistic complexity loss in text-based therapy.
J. Wei, K. Finn, E. Templeton, T. Wheatley, and S. Vosoughi.
NAACL '21 Few-shot text classification with triplet networks, data augmentation, and curriculum learning.
J. Wei, C. Huang, S. Vosoughi, Y. Cheng, and S. Xu.
EACL '21 Text augmentation in a multi-task view.
J. Wei, C. Huang, S. Xu, and S. Vosoughi.
AAAI '21 Mitigating political bias in language models through reinforced calibration (outstanding paper).
R. Liu, C. Jia, J. Wei, G. Xu, L. Wang, and S. Vosoughi.
EMNLP '19 Easy data augmentation techniques for boosting performance on text classification tasks.
J. Wei and K. Zou.