*Authors*: [Yue Wang](https://yuewang-cuhk.github.io/)\*, [Hung Le](https://sites.google.com/view/henryle2018/home?pli=1)\*, [Akhilesh Deepak Gotmare](https://akhileshgotmare.github.io/), [Nghi D.Q. Bui](https://bdqnghi.github.io/), [Junnan Li](https://sites.google.com/site/junnanlics), [Steven C.H. Hoi](https://sites.google.com/view/stevenhoi/home) (* indicates equal contribution)
# What is this about?
CodeT5+ is a new family of open code large language models with an encoder-decoder architecture that can flexibly operate in different modes (i.e. _encoder-only_, _decoder-only_, and _encoder-decoder_) to support a wide range of code understanding and generation tasks.
Furthermore, we explore instruction tuning to align the model with natural language instructions.

![CodeT5+ overview](codet5p_overview.png)
## Table of Contents

1. [Released Models](#released-models)
2. [How to Use?](#how-to-use)
3. [Instruction Tuning to Align with Natural Language Instructions](#instruction-tuning-to-align-with-natural-language-instructions)
4. [How to Finetune Using Your Own Data?](#how-to-finetune-using-your-own-data)
5. [Reproduce the Results](#reproduce-the-results)
6. [Citation](#citation)

# Released Models

We implemented a family of CodeT5+ models, with model sizes ranging from 220M to 16B.
Note that CodeT5+ `220M` and `770M` employ the same architectures as CodeT5-base and CodeT5-large respectively and are pretrained from scratch, while CodeT5+ `2B`, `6B`, and `16B` employ a "_shallow encoder and deep decoder_" architecture, with the shallow encoder initialized from CodeGen-mono 350M and the deep decoder initialized from CodeGen-mono 2B, 6B, and 16B, respectively.
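
To make this concrete, below is a minimal sketch of loading one of the smaller checkpoints with Hugging Face Transformers. The Hub identifier and the infilling prompt are illustrative assumptions; the larger shallow-encoder/deep-decoder checkpoints may additionally require `trust_remote_code=True`.

```python
# Minimal sketch (not the official recipe): load a CodeT5+ checkpoint with
# Hugging Face Transformers and run CodeT5-style span infilling.
# "Salesforce/codet5p-220m" is an assumed Hub identifier.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "Salesforce/codet5p-220m"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# The model fills in the <extra_id_0> sentinel token.
code = "def print_hello_world():\n    <extra_id_0>\n\nprint_hello_world()"
inputs = tokenizer(code, return_tensors="pt")
outputs = model.generate(**inputs, max_length=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
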
# Reproduce the Results

## HumanEval

Our CodeT5+ models achieve very strong results on the HumanEval benchmark in the zero-shot setting. Following common practice, we employ nucleus sampling with different temperatures `T` for computing `Pass@k` (`T=0.2, 0.6, 0.8` for `k=1, 10, 100`, respectively).

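For reference, `Pass@k` is typically computed with the unbiased estimator from the HumanEval paper (Chen et al., 2021): sample `n` completions per problem, count the `c` that pass all unit tests, and average the quantity below over all problems. A minimal sketch:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: 1 - C(n - c, k) / C(n, k),
    computed as a numerically stable running product."""
    if n - c < k:
        return 1.0  # every size-k subset contains a correct completion
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))
```
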
| Model                    | Pass@1   | Pass@10  | Pass@100 |
|--------------------------|----------|----------|----------|
| LLaMA 7B                 | 10.5     | -        | 36.5     |
| LaMDA 137B               | 14.0     | -        | 47.3     |
| InCoder 6B               | 15.2     | 27.8     | 47.0     |
| GPT-NeoX 20B             | 15.4     | 25.6     | 41.2     |
| **CodeT5+ 770M**         | 15.5     | 27.2     | 42.7     |
| LLaMA 13B                | 15.8     | -        | 52.5     |
| PaLM 62B                 | 15.9     | -        | 46.3     |
| AlphaCode 1.1B           | 17.1     | 28.2     | 45.3     |
| LLaMA 33B                | 21.7     | -        | 70.7     |
| Replit 3B                | 21.9     | -        | -        |
| CodeGeeX 13B             | 22.9     | 39.6     | 60.9     |
| LLaMA 65B                | 23.7     | -        | 79.3     |
| PaLM 540B                | 26.2     | -        | 76.2     |
| CodeGen-mono 16B         | 29.3     | 49.9     | 75.0     |
| **CodeT5+ 16B**          | 30.9     | 51.6     | 76.7     |
| code-cushman-001         | 33.5     | 54.3     | 77.4     |
| StarCoder 15B            | 33.6     | -        | -        |
| **InstructCodeT5+ 16B**  | **36.1** | **57.1** | **80.7** |

Please follow the instructions below to reproduce the results.

---

### Installation

* Install the official HumanEval evaluation tool released by OpenAI following the instructions in this [repo](https://github.com/openai/human-eval).
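
For an end-to-end picture, the sketch below samples completions with nucleus sampling and scores them with the `human-eval` package. The checkpoint name, sample count, and decoding settings are assumptions for illustration, not the exact configuration behind the table above.

```python
# Sketch: generate HumanEval completions with nucleus sampling, then score
# them with OpenAI's human-eval package. Note that human-eval requires you
# to explicitly enable code execution (see its README) before scoring.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from human_eval.data import read_problems, write_jsonl
from human_eval.evaluation import evaluate_functional_correctness

checkpoint = "Salesforce/codet5p-770m"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint).eval()

n_samples, temperature = 20, 0.2  # low T targets Pass@1 (assumed settings)
samples = []
for task_id, problem in read_problems().items():
    inputs = tokenizer(problem["prompt"], return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            do_sample=True,          # nucleus sampling
            top_p=0.95,
            temperature=temperature,
            max_new_tokens=256,
            num_return_sequences=n_samples,
        )
    for out in outputs:
        completion = tokenizer.decode(out, skip_special_tokens=True)
        samples.append({"task_id": task_id, "completion": completion})

write_jsonl("samples.jsonl", samples)
# Runs the unit tests in sandboxed subprocesses and reports Pass@k.
print(evaluate_functional_correctness("samples.jsonl", k=[1, 10]))
```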