Commit 0503b1b

updates for 345M model

1 parent d14501a

5 files changed (+9, -5)

Diff for: .gitignore (+1)

@@ -1,2 +1,3 @@
 __pycache__
+.mypy_cache/
 models/

Diff for: DEVELOPERS.md (+1)

@@ -28,6 +28,7 @@ pip3 install -r requirements.txt
 Download the model data
 ```
 python3 download_model.py 117M
+python3 download_model.py 345M
 ```

 ## Docker Installation
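For context, a minimal sketch of the setup flow these docs now describe, assuming download_model.py sits at the repository root as shown above; the fetched checkpoints presumably land under the models/ directory that .gitignore excludes:

```
# Fetch both released checkpoints into models/
python3 download_model.py 117M
python3 download_model.py 345M
```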

Diff for: Dockerfile.cpu (+1)

@@ -6,3 +6,4 @@ WORKDIR /gpt-2
 ADD . /gpt-2
 RUN pip3 install -r requirements.txt
 RUN python3 download_model.py 117M
+RUN python3 download_model.py 345M

Diff for: Dockerfile.gpu (+1)

@@ -15,3 +15,4 @@ WORKDIR /gpt-2
 ADD . /gpt-2
 RUN pip3 install -r requirements.txt
 RUN python3 download_model.py 117M
+RUN python3 download_model.py 345M
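For reference, a hedged sketch of building and running the updated images; the gpt-2 tag is illustrative, and the GPU invocation assumes the NVIDIA container runtime is installed:

```
# CPU image: both checkpoints are now downloaded at build time
docker build --tag gpt-2 -f Dockerfile.cpu .
docker run -it gpt-2 bash

# GPU image (assumes the nvidia runtime is available)
docker build --tag gpt-2 -f Dockerfile.gpu .
docker run --runtime=nvidia -it gpt-2 bash
```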

Diff for: README.md (+5, -5)

@@ -2,23 +2,23 @@

 Code and samples from the paper ["Language Models are Unsupervised Multitask Learners"](https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf).

-For now, we have only released a smaller (117M parameter) version of GPT-2.
+We have currently released small (117M parameter) and medium (345M parameter) versions of GPT-2.

 See more details in our [blog post](https://blog.openai.com/better-language-models/).

 ## Usage

-This repository is meant to be a starting point for researchers and engineers to experiment with GPT-2-117M. While GPT-2-117M is less proficient than GPT-2-1.5B, it is useful for a wide range of research and applications which could also apply to larger models.
+This repository is meant to be a starting point for researchers and engineers to experiment with GPT-2.

 ### Some caveats

-- GPT-2-117M robustness and worst case behaviors are not well-understood. As with any machine-learned model, carefully evaluate GPT-2-117M for your use case, especially if used without fine-tuning or in safety-critical applications where reliability is important.
-- The dataset our GPT-2-117M was trained on contains many texts with [biases](https://twitter.com/TomerUllman/status/1101485289720242177) and factual inaccuracies, and thus GPT-2-117M is likely to be biased and inaccurate as well.
+- GPT-2 models' robustness and worst case behaviors are not well-understood. As with any machine-learned model, carefully evaluate GPT-2 for your use case, especially if used without fine-tuning or in safety-critical applications where reliability is important.
+- The dataset our GPT-2 models were trained on contains many texts with [biases](https://twitter.com/TomerUllman/status/1101485289720242177) and factual inaccuracies, and thus GPT-2 models are likely to be biased and inaccurate as well.
 - To avoid having samples mistaken as human-written, we recommend clearly labeling samples as synthetic before wide dissemination. Our models are often incoherent or inaccurate in subtle ways, which takes more than a quick read for a human to notice.

 ### Work with us

-Please [let us know](mailto:[email protected]) if you’re doing interesting research with or working on applications of GPT-2-117M! We’re especially interested in hearing from and potentially working with those who are studying
+Please [let us know](mailto:[email protected]) if you’re doing interesting research with or working on applications of GPT-2! We’re especially interested in hearing from and potentially working with those who are studying
 - Potential malicious use cases and defenses against them (e.g. the detectability of synthetic text)
 - The extent of problematic content (e.g. bias) being baked into the models and effective mitigations
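With both checkpoints available, model selection happens at sampling time. A minimal sketch, assuming the repository's sampling scripts and their --model_name flag; the script names and flags here are assumptions, not confirmed by this diff:

```
# Unconditional samples from the medium model (assumed script and flag names)
python3 src/generate_unconditional_samples.py --model_name 345M

# Prompt-conditioned, interactive sampling (assumed script and flag names)
python3 src/interactive_conditional_samples.py --model_name 345M --top_k 40
```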