update readme

WANG Yue 2023-07-12 16:00:02 +08:00
parent 3b529e206d
commit ebf3075b24


@@ -8,16 +8,6 @@ Find out more via our [blog post](https://blog.salesforceairesearch.com/codet5-o
*Authors*: [Yue Wang](https://yuewang-cuhk.github.io/)\*, [Hung Le](https://sites.google.com/view/henryle2018/home?pli=1)\*, [Akhilesh Deepak Gotmare](https://akhileshgotmare.github.io/), [Nghi D.Q. Bui](https://bdqnghi.github.io/), [Junnan Li](https://sites.google.com/site/junnanlics), [Steven C.H. Hoi](https://sites.google.com/view/stevenhoi/home) (* indicates equal contribution)
# What is this about?
CodeT5+ is a new family of open code large language models with an encoder-decoder architecture that can flexibly operate in different modes (i.e. _encoder-only_, _decoder-only_, and _encoder-decoder_) to support a wide range of code understanding and generation tasks.
@@ -27,6 +17,16 @@ Furthermore, we explore instruction tuning to align the model with natural langu
![CodeT5+ overview](codet5p_overview.png)
## Table of Contents
1. [Released Models](#released-models)
2. [How to Use?](#how-to-use)
3. [Instruction Tuning to Align with Natural Language Instructions](#instruction-tuning-to-align-with-natural-language-instructions)
4. [How to Finetune Using Your Own Data?](#how-to-finetune-using-your-own-data)
5. [Reproduce the Results](#reproduce-the-results)
6. [Citation](#citation)
# Released Models
We implemented a family of CodeT5+ models, with model sizes ranging from 220M to 16B.
Note that CodeT5+ `220M` and `770M` employ the same architectures as CodeT5-base and CodeT5-large, respectively, and are pretrained from scratch, while CodeT5+ `2B`, `6B`, and `16B` employ a "_shallow encoder and deep decoder_" architecture, with the shallow encoder initialized from CodeGen-mono 350M and the deep decoder initialized from CodeGen-mono 2B, 6B, and 16B, respectively.
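As a rough sketch of how one of these checkpoints can be loaded with Hugging Face Transformers (the Hub ID and generation settings below are assumptions for illustration, not the repository's official example):

```python
# Minimal sketch: load a CodeT5+ checkpoint and generate a completion.
# The checkpoint ID and settings are assumptions, not this repo's official snippet.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "Salesforce/codet5p-220m"  # assumed Hub ID; 2B/6B/16B variants ship custom model code
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(
    checkpoint,
    trust_remote_code=True,  # needed for the larger "shallow encoder and deep decoder" variants
).to(device)

inputs = tokenizer("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_length=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```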
@@ -94,31 +94,32 @@ This script naturally supports both single-GPU and multi-GPU training. If you ha
# Reproduce the Results
## HumanEval
Our CodeT5+ models achieve strong results on the HumanEval benchmark in the zero-shot setting. Following common practice, we employ nucleus sampling with a different temperature `T` for each `Pass@k` (`T=0.2, 0.6, 0.8` for `k=1, 10, 100`, respectively).
| Model | Pass@1 | Pass@10 | Pass@100 |
|-------------------------|----------|----------|----------|
| LLaMA 7B | 10.5 | - | 36.5 |
| LaMDA 137B | 14.0 | - | 47.3 |
| InCoder 6B | 15.2 | 27.8 | 47.0 |
| GPT-NeoX 20B | 15.4 | 25.6 | 41.2 |
| **CodeT5+ 770M** | 15.5 | 27.2 | 42.7 |
| LLaMA 13B | 15.8 | - | 52.5 |
| PaLM 62B | 15.9 | - | 46.3 |
| AlphaCode 1.1B | 17.1 | 28.2 | 45.3 |
| LLaMA 33B | 21.7 | - | 70.7 |
| Replit 3B | 21.9 | - | - |
| CodeGeeX 13B | 22.9 | 39.6 | 60.9 |
| LLaMA 65B | 23.7 | - | 79.3 |
| PaLM 540B | 26.2 | - | 76.2 |
| CodeGen-mono 16B | 29.3 | 49.9 | 75.0 |
| **CodeT5+ 16B** | 30.9 | 51.6 | 76.7 |
| code-cushman-001 | 33.5 | 54.3 | 77.4 |
| StarCoder 15B | 33.6 | - | - |
| **InstructCodeT5+ 16B** | **36.1** | **57.1** | **80.7** |
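For reference, `Pass@k` is usually computed with the unbiased estimator from the Codex paper, applied to `n` sampled completions per problem of which `c` pass the unit tests. A minimal sketch of that estimator (not the repository's evaluation code) is shown below.

```python
# Unbiased Pass@k estimator (Chen et al., 2021): probability that at least one of k
# completions drawn from n generated samples is correct, given c correct samples.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0  # fewer than k incorrect samples, so any draw of k contains a correct one
    return float(1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: 200 samples for one problem, 30 of which pass the tests.
print(pass_at_k(n=200, c=30, k=1))    # 0.15
print(pass_at_k(n=200, c=30, k=100))  # close to 1.0
```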
Please follow the instructions below to reproduce the results.
---
### Installation
* Install the official HumanEval evaluation tool released by OpenAI following the instructions in this [repo](https://github.com/openai/human-eval).
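As a rough sketch of how completions could be collected for that tool before scoring (the sample-writing helpers follow the `human-eval` package's public API; `generate_one_completion` is a hypothetical placeholder for sampling from a CodeT5+ model):

```python
# Minimal sketch: write a samples.jsonl file that the human-eval tool can score.
from human_eval.data import read_problems, write_jsonl

def generate_one_completion(prompt: str) -> str:
    # Hypothetical placeholder: sample a completion from your model here,
    # e.g. nucleus sampling with the temperatures listed above.
    return "    pass\n"

problems = read_problems()        # the 164 HumanEval problems
num_samples_per_task = 200        # enough samples to estimate Pass@1/10/100

samples = [
    dict(task_id=task_id, completion=generate_one_completion(problems[task_id]["prompt"]))
    for task_id in problems
    for _ in range(num_samples_per_task)
]
write_jsonl("samples.jsonl", samples)
# Scoring is then done with the tool's CLI: evaluate_functional_correctness samples.jsonl
```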