diff --git a/CodeT5+/README.md b/CodeT5+/README.md
index b34265b..bd175f3 100644
--- a/CodeT5+/README.md
+++ b/CodeT5+/README.md
@@ -8,16 +8,6 @@ Find out more via our [blog post](https://blog.salesforceairesearch.com/codet5-o
 *Authors*: [Yue Wang](https://yuewang-cuhk.github.io/)\*, [Hung Le](https://sites.google.com/view/henryle2018/home?pli=1)\*, [Akhilesh Deepak Gotmare](https://akhileshgotmare.github.io/), [Nghi D.Q. Bui](https://bdqnghi.github.io/), [Junnan Li](https://sites.google.com/site/junnanlics), [Steven C.H. Hoi](https://sites.google.com/view/stevenhoi/home) (* indicates equal contribution)
 
-## Table of Contents
-
-1. [What is this about?](#what-is-this-about)
-2. [Released Models](#released-models)
-3. [How to Use?](#how-to-use)
-4. [Instruction Tuning to Align with Natural Language Instructions](#instruction-tuning-to-align-with-natural-language-instructions)
-5. [How to Finetune Using Your Own Data?](#how-to-finetune-using-your-own-data)
-6. [Reproduce the Results](#reproduce-the-results)
-7. [Citation](#citation)
-
 # What is this about?
 CodeT5+ is a new family of open code large language models with an encoder-decoder architecture that can flexibly operate in different modes (i.e. _encoder-only_, _decoder-only_, and _encoder-decoder_) to support a wide range of code understanding and generation tasks.
@@ -27,6 +17,16 @@ Furthermore, we explore instruction tuning to align the model with natural langu
 
 ![CodeT5+ overview](codet5p_overview.png)
 
+## Table of Contents
+
+1. [Released Models](#released-models)
+2. [How to Use?](#how-to-use)
+3. [Instruction Tuning to Align with Natural Language Instructions](#instruction-tuning-to-align-with-natural-language-instructions)
+4. [How to Finetune Using Your Own Data?](#how-to-finetune-using-your-own-data)
+5. [Reproduce the Results](#reproduce-the-results)
+6. [Citation](#citation)
+
+
 # Released Models
 We implemented a family of CodeT5+ models, with model size ranging from 220M to 16B.
 Note that CodeT5+ `220M` and `770M` employ the same architecture of CodeT5-base and large respectively and are pretrained from scratch, while CodeT5+ `2B`, `6B`, `16B` employ a "_shallow encoder and deep decoder_" architecture with the shallow encoder initialized from CodeGen-mono 350M and the deep decoder initialized from CodeGen-mono 2B, 6B, 16B, respectively.
@@ -94,31 +94,32 @@ This script naturally supports both single-GPU and multi-GPU training. If you ha
 # Reproduce the Results
 
 ## HumanEval
-Our CodeT5+ models achieves strong results on HumanEval benchmark in zero-shot setting. We follow common practices to employ nucleus sampling with different temperature `T` for computing `Pass@k` (`T=0.2,0.6,0.8` for `k=1,10,100` respectively). 
+Our CodeT5+ models achieve strong results on the HumanEval benchmark in the zero-shot setting. We follow common practice and use nucleus sampling with a different temperature `T` for each `Pass@k` (`T=0.2,0.6,0.8` for `k=1,10,100`, respectively).
-| Model               | Pass@1   | Pass@10  | Pass@100 |
-|---------------------|----------|----------|----------|
-| LLaMA 7B            | 10.5     | -        | 36.5     |
-| LaMDA 137B          | 14.0     | -        | 47.3     |
-| InCoder 6B          | 15.2     | 27.8     | 47.0     |
-| GPT-NeoX 20B        | 15.4     | 25.6     | 41.2     |
-| CodeT5+ 770M        | 15.5     | 27.2     | 42.7     |
-| LLaMA 13B           | 15.8     | -        | 52.5     |
-| PaLM 62B            | 15.9     | -        | 46.3     |
-| AlphaCode 1.1B      | 17.1     | 28.2     | 45.3     |
-| LLaMA 33B           | 21.7     | -        | 70.7     |
-| Replit 3B           | 21.9     | -        | -        |
-| CodeGeeX 13B        | 22.9     | 39.6     | 60.9     |
-| LLaMA 65B           | 23.7     | -        | 79.3     |
-| PaLM 540B           | 26.2     | -        | 76.2     |
-| CodeGen-mono 16B    | 29.3     | 49.9     | 75.0     |
-| CodeT5+ 16B         | 30.9     | 51.6     | 76.7     |
-| code-cushman-001    | 33.5     | 54.3     | 77.4     |
-| StarCoder 15B       | 33.6     | -        | -        |
-| InstructCodeT5+ 16B | **36.1** | **57.1** | **80.7** |
+| Model                   | Pass@1   | Pass@10  | Pass@100 |
+|-------------------------|----------|----------|----------|
+| LLaMA 7B                | 10.5     | -        | 36.5     |
+| LaMDA 137B              | 14.0     | -        | 47.3     |
+| InCoder 6B              | 15.2     | 27.8     | 47.0     |
+| GPT-NeoX 20B            | 15.4     | 25.6     | 41.2     |
+| **CodeT5+ 770M**        | 15.5     | 27.2     | 42.7     |
+| LLaMA 13B               | 15.8     | -        | 52.5     |
+| PaLM 62B                | 15.9     | -        | 46.3     |
+| AlphaCode 1.1B          | 17.1     | 28.2     | 45.3     |
+| LLaMA 33B               | 21.7     | -        | 70.7     |
+| Replit 3B               | 21.9     | -        | -        |
+| CodeGeeX 13B            | 22.9     | 39.6     | 60.9     |
+| LLaMA 65B               | 23.7     | -        | 79.3     |
+| PaLM 540B               | 26.2     | -        | 76.2     |
+| CodeGen-mono 16B        | 29.3     | 49.9     | 75.0     |
+| **CodeT5+ 16B**         | 30.9     | 51.6     | 76.7     |
+| code-cushman-001        | 33.5     | 54.3     | 77.4     |
+| StarCoder 15B           | 33.6     | -        | -        |
+| **InstructCodeT5+ 16B** | **36.1** | **57.1** | **80.7** |
 
 Please follow the instructions below to reproduce the results.
 
+---
+
 ### Installation
 
 * Install the official HumanEval evaluation tool released by OpenAI following the instructions in this [repo](https://github.com/openai/human-eval).
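
For context on the checkpoints this README refers to, here is a minimal loading sketch along the lines of the Hugging Face model cards; the checkpoint name, device choice, and prompt are illustrative assumptions, not part of this patch:

```python
# Minimal sketch: load a CodeT5+ checkpoint from the Hugging Face Hub.
# Assumptions: `transformers` and `torch` are installed; the checkpoint name
# is illustrative; the 2B/6B/16B variants additionally need
# trust_remote_code=True because of their custom architecture.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "Salesforce/codet5p-770m"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs.input_ids, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```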
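For the HumanEval reproduction step, OpenAI's tool scores a JSONL file of sampled completions. A hedged sketch of that plumbing, mirroring the usage example in the human-eval repo: `read_problems` and `write_jsonl` are real helpers from that package, while the generation stub, sample count, and temperature here are placeholders rather than CodeT5+'s exact recipe:

```python
# Sketch: produce samples.jsonl in the format expected by OpenAI's human-eval
# harness, then score it with its `evaluate_functional_correctness` command.
from human_eval.data import read_problems, write_jsonl

def generate_one_completion(prompt: str) -> str:
    # Placeholder: replace with nucleus-sampled decoding of `prompt` using the
    # model/tokenizer loaded above (e.g. T=0.2 when targeting Pass@1).
    return "    pass"  # syntactically valid (and failing) stub completion

problems = read_problems()  # {task_id: {"prompt": ..., "test": ...}, ...}
n_samples_per_task = 20     # illustrative; choose enough samples for your k

samples = [
    dict(task_id=task_id,
         completion=generate_one_completion(problems[task_id]["prompt"]))
    for task_id in problems
    for _ in range(n_samples_per_task)
]
write_jsonl("samples.jsonl", samples)
# Then score with: evaluate_functional_correctness samples.jsonl
```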