update readme

2024-10-01 06:35:38 -04:00 · 2023-07-12 16:00:02 +08:00 · ebf3075b24
commit ebf3075b24
parent 3b529e206d
1 changed file with 32 additions and 31 deletions
--- a/CodeT5+/README.md
+++ b/CodeT5+/README.md
@@ -8,16 +8,6 @@ Find out more via our [blog post](https://blog.salesforceairesearch.com/codet5-o
 *Authors*: [Yue Wang](https://yuewang-cuhk.github.io/)\*, [Hung Le](https://sites.google.com/view/henryle2018/home?pli=1)\*, [Akhilesh Deepak Gotmare](https://akhileshgotmare.github.io/), [Nghi D.Q. Bui](https://bdqnghi.github.io/), [Junnan Li](https://sites.google.com/site/junnanlics), [Steven C.H. Hoi](https://sites.google.com/view/stevenhoi/home) (* indicates equal contribution)
-## Table of Contents
-1. [What is this about?](#what-is-this-about)
-2. [Released Models](#released-models)
-3. [How to Use?](#how-to-use)
-4. [Instruction Tuning to Align with Natural Language Instructions](#instruction-tuning-to-align-with-natural-language-instructions)
-5. [How to Finetune Using Your Own Data?](#how-to-finetune-using-your-own-data)
-6. [Reproduce the Results](#reproduce-the-results)
-7. [Citation](#citation)
 # What is this about?
 CodeT5+ is a new family of open code large language models with an encoder-decoder architecture that can flexibly operate in different modes (i.e. _encoder-only_, _decoder-only_, and _encoder-decoder_) to support a wide range of code understanding and generation tasks.
@@ -27,6 +17,16 @@ Furthermore, we explore instruction tuning to align the model with natural langu
 ![CodeT5+ overview](codet5p_overview.png)
+## Table of Contents
+1. [Released Models](#released-models)
+2. [How to Use?](#how-to-use)
+3. [Instruction Tuning to Align with Natural Language Instructions](#instruction-tuning-to-align-with-natural-language-instructions)
+4. [How to Finetune Using Your Own Data?](#how-to-finetune-using-your-own-data)
+5. [Reproduce the Results](#reproduce-the-results)
+6. [Citation](#citation)
 # Released Models
 We implemented a family of CodeT5+ models, with model size ranging from 220M to 16B. 
 Note that CodeT5+ `220M` and `770M` employ the same architecture of CodeT5-base and large respectively and are pretrained from scratch, while CodeT5+ `2B`, `6B`, `16B` employ a "_shallow encoder and deep decoder_" architecture with the shallow encoder initialized from CodeGen-mono 350M and the deep decoder initialized from CodeGen-mono 2B, 6B, 16B, respectively.
@@ -94,31 +94,32 @@ This script naturally supports both single-GPU and multi-GPU training. If you ha
 # Reproduce the Results
 ## HumanEval
-Our CodeT5+ models achieves strong results on HumanEval benchmark in zero-shot setting. We follow common practices to employ nucleus sampling with different temperature `T` for computing `Pass@k` (`T=0.2,0.6,0.8` for `k=1,10,100` respectively).
+Our CodeT5+ models achieve very strong results on HumanEval benchmark in zero-shot setting. We follow common practices to employ nucleus sampling with different temperature `T` for computing `Pass@k` (`T=0.2,0.6,0.8` for `k=1,10,100` respectively).
-| Model               | Pass@1   | Pass@10  | Pass@100 |
+| Model                   | Pass@1   | Pass@10  | Pass@100 |
-|---------------------|----------|----------|----------|
+|-------------------------|----------|----------|----------|
-| LLaMA 7B            | 10.5     | -        | 36.5     |
+| LLaMA 7B                | 10.5     | -        | 36.5     |
-| LaMDA  137B         | 14.0     | -        | 47.3     |
+| LaMDA  137B             | 14.0     | -        | 47.3     |
-| InCoder 6B          | 15.2     | 27.8     | 47.0     |
+| InCoder 6B              | 15.2     | 27.8     | 47.0     |
-| GPT-NeoX 20B        | 15.4     | 25.6     | 41.2     |
+| GPT-NeoX 20B            | 15.4     | 25.6     | 41.2     |
-| CodeT5+ 770M        | 15.5     | 27.2     | 42.7     |
+| **CodeT5+ 770M**        | 15.5     | 27.2     | 42.7     |
-| LLaMA 13B           | 15.8     | -        | 52.5     |
+| LLaMA 13B               | 15.8     | -        | 52.5     |
-| PaLM  62B           | 15.9     | -        | 46.3     |
+| PaLM  62B               | 15.9     | -        | 46.3     |
-| AlphaCode 1.1B      | 17.1     | 28.2     | 45.3     |
+| AlphaCode 1.1B          | 17.1     | 28.2     | 45.3     |
-| LLaMA 33B           | 21.7     | -        | 70.7     |
+| LLaMA 33B               | 21.7     | -        | 70.7     |
-| Replit 3B           | 21.9     | -        | -        |
+| Replit 3B               | 21.9     | -        | -        |
-| CodeGeeX 13B        | 22.9     | 39.6     | 60.9     |
+| CodeGeeX 13B            | 22.9     | 39.6     | 60.9     |
-| LLaMA 65B           | 23.7     | -        | 79.3     |
+| LLaMA 65B               | 23.7     | -        | 79.3     |
-| PaLM  540B          | 26.2     | -        | 76.2     |
+| PaLM  540B              | 26.2     | -        | 76.2     |
-| CodeGen-mono 16B    | 29.3     | 49.9     | 75.0     |
+| CodeGen-mono 16B        | 29.3     | 49.9     | 75.0     |
-| CodeT5+ 16B         | 30.9     | 51.6     | 76.7     |
+| **CodeT5+ 16B**         | 30.9     | 51.6     | 76.7     |
-| code-cushman-001    | 33.5     | 54.3     | 77.4     |
+| code-cushman-001        | 33.5     | 54.3     | 77.4     |
-| StarCoder 15B       | 33.6     | -        | -        |
+| StarCoder 15B           | 33.6     | -        | -        |
-| InstructCodeT5+ 16B | **36.1** | **57.1** | **80.7** |
+| **InstructCodeT5+ 16B** | **36.1** | **57.1** | **80.7** |
 Please follow the instructions below to reproduce the results.
 ---
 ### Installation
 * Install the official HumanEval evaluation tool released by OpenAI following the instructions in this [repo](https://github.com/openai/human-eval).
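
Note on `Pass@k`: the README does not spell out how the table values are computed from sampled generations. The sketch below shows the unbiased estimator commonly used for HumanEval, `1 - C(n-c, k) / C(n, k)`, where `n` is the number of samples drawn per problem and `c` is the number that pass the unit tests. That this is the estimator behind the table is an assumption; the function names `pass_at_k` and `mean_pass_at_k` and the sample counts in the usage lines are hypothetical and not taken from the README.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated for the problem
    c: samples that pass all unit tests
    k: the k in pass@k (must not exceed n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains at least one pass.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical per-problem counts (n samples, c correct), for illustration only.
example_results = [(20, 0), (20, 5), (20, 20)]
example_score = mean_pass_at_k(example_results, k=10)
```

Under this estimator, each `k` would be scored on samples drawn at the temperature the README pairs with it (`T=0.2,0.6,0.8` for `k=1,10,100`), so the three columns come from three separate sampling runs.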