Add Tatoeba #802

Open
wants to merge 32 commits into
base: eval-hackathon
32 commits
8206ffc
add EN prompts for xcopa
Jul 15, 2022
6af75cb
update ids (2/5)
Jul 15, 2022
1d4d0cc
regenerate ids (4/6)
Jul 15, 2022
e020740
regenerate ids (5/6)
Jul 15, 2022
8511070
regenerate ids (6/6)
Jul 15, 2022
cc65cf3
fix typo
Jul 15, 2022
670e6ad
fix last keyerror?
Jul 15, 2022
f906000
Merge pull request #1 from haileyschoelkopf/xcopa
Muennighoff Jul 16, 2022
c56ce86
Add xwinograd/en
Muennighoff Jul 17, 2022
d164dbf
Duplicate template
Muennighoff Jul 17, 2022
0bfade4
Format
Muennighoff Jul 18, 2022
33ab12e
Merge pull request #2 from bigscience-workshop/eval-hackathon
Muennighoff Jul 18, 2022
c80db69
wip -test
VictorSanh Jul 18, 2022
12ef8de
wip - test
VictorSanh Jul 18, 2022
35e29a8
wip - most stupid
VictorSanh Jul 18, 2022
4d5c500
wip - test
VictorSanh Jul 18, 2022
62ee1a8
de
VictorSanh Jul 18, 2022
1d55c8d
re-clean it - have not make it work yet
VictorSanh Jul 18, 2022
fcc9b8c
Change IDs
Muennighoff Jul 18, 2022
624878f
Merge branch 'tr13' into muennighoff/xwinogrande
Muennighoff Jul 19, 2022
9a38a4b
Merge pull request #3 from Muennighoff/muennighoff/xwinogrande
Muennighoff Jul 19, 2022
619daa1
Merge pull request #4 from bigscience-workshop/eval-hackathon
Muennighoff Jul 19, 2022
855549f
Merge pull request #5 from Muennighoff/eval-hackathon
Muennighoff Jul 19, 2022
3053aa0
Rmv incompat prompts
Muennighoff Jul 19, 2022
36477c1
Add eng template xcopa
Muennighoff Jul 19, 2022
78dd720
Assimilate en
Muennighoff Jul 19, 2022
ba36b0c
例えば (Japanese: "for example")
Muennighoff Jul 19, 2022
d05a9ee
Remove var
Muennighoff Jul 19, 2022
8125d23
Remove dup
Muennighoff Jul 19, 2022
0815263
Merge branch 'tr13' into tatoeba
Muennighoff Jul 19, 2022
be399eb
Swap source & target; Rmv script
Muennighoff Jul 21, 2022
45a3321
t pMerge branch 'tatoeba' of https://github.com/Muennighoff/promptsou…
Muennighoff Jul 21, 2022
14 changes: 13 additions & 1 deletion promptsource/templates.py
@@ -27,7 +27,19 @@
# These are users whose datasets should be included in the results returned by
# filter_english_datasets (regardless of their metadata)

-INCLUDED_USERS = {"Zaid", "craffel", "GEM", "aps", "khalidalt", "shanya", "rbawden", "BigScienceBiasEval", "gsarti"}
+INCLUDED_USERS = {
+    "Zaid",
+    "craffel",
+    "GEM",
+    "aps",
+    "khalidalt",
+    "shanya",
+    "rbawden",
+    "BigScienceBiasEval",
+    "gsarti",
+    "Helsinki-NLP",
+    "Muennighoff",
+}

# These are the metrics with which templates can be tagged
METRICS = {
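The comment in the hunk above says that datasets from users in INCLUDED_USERS are returned by filter_english_datasets regardless of their metadata. A minimal sketch of such a check follows. The helper name `is_included` and its arguments are hypothetical and not taken from promptsource; only the set contents come from this diff.

```python
# Set contents copied from the diff above.
INCLUDED_USERS = {
    "Zaid", "craffel", "GEM", "aps", "khalidalt", "shanya", "rbawden",
    "BigScienceBiasEval", "gsarti", "Helsinki-NLP", "Muennighoff",
}


def is_included(dataset_id: str, languages: list) -> bool:
    """Hypothetical check: keep a dataset if it is tagged English
    or if its owner (the part before '/') is in INCLUDED_USERS."""
    owner = dataset_id.split("/")[0] if "/" in dataset_id else None
    return "en" in languages or owner in INCLUDED_USERS


# The Tatoeba dataset passes through its owner even with no language tags.
print(is_included("Helsinki-NLP/tatoeba_mt", []))
```

Under this assumption, adding "Helsinki-NLP" to the set is what lets Helsinki-NLP/tatoeba_mt through when its metadata does not mark it as English.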
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: ara-eng
templates:
  2ca27188-17c3-4e83-884c-90c93f53821a: !Template
    answer_choices: null
    id: 2ca27188-17c3-4e83-884c-90c93f53821a
    jinja: Translate the following text from English to Arabic {{ targetString }}|||
      {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-eng-ara
    reference: ''
  3a6ba80f-2b5d-471d-93cc-208a74d09b25: !Template
    answer_choices: null
    id: 3a6ba80f-2b5d-471d-93cc-208a74d09b25
    jinja: Translate the following text from Arabic to English {{ sourceString }}|||
      {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-ara-eng
    reference: ''
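To make the template convention concrete: the text before `|||` is the prompt shown to the model and the text after it is the expected output. The sketch below fills the placeholders with a small regex substitution as a stand-in for real Jinja rendering, then splits on `|||`. The `render` helper and the example row are hypothetical; only the template string and the field names `sourceString` and `targetString` come from the file above.

```python
import re
from typing import Dict, Tuple

# Template text from the ara-eng file above. YAML folds the two-line
# plain scalar into one line joined by a single space.
TEMPLATE = (
    "Translate the following text from English to Arabic "
    "{{ targetString }}||| {{ sourceString }}"
)


def render(template: str, example: Dict[str, str]) -> Tuple[str, str]:
    """Fill {{ name }} placeholders (regex stand-in for Jinja),
    then split on '|||' into (prompt, target)."""
    filled = re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: example[m.group(1)],
        template,
    )
    prompt, target = (part.strip() for part in filled.split("|||"))
    return prompt, target


# Hypothetical row: in the ara-eng subset the source side is Arabic
# and the target side is English.
row = {"sourceString": "مرحبا", "targetString": "Hello"}
prompt, target = render(TEMPLATE, row)
print(prompt)
print(target)
```

For the eng-ara direction the English `targetString` ends up in the prompt and the Arabic `sourceString` becomes the expected output, which is why the two variables appear swapped relative to the ara-eng template.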
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: ara-fra
templates:
  a867b103-a1c4-43b4-ae2e-eddc3cb3b890: !Template
    answer_choices: null
    id: a867b103-a1c4-43b4-ae2e-eddc3cb3b890
    jinja: Translate the following text from French to Arabic {{ targetString }}|||
      {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-fra-ara
    reference: ''
  e053166a-1a47-42e9-9970-27faad375e2c: !Template
    answer_choices: null
    id: e053166a-1a47-42e9-9970-27faad375e2c
    jinja: Translate the following text from Arabic to French {{ sourceString }}|||
      {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-ara-fra
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: ara-spa
templates:
  60875520-509d-4390-b37c-feab2414e139: !Template
    answer_choices: null
    id: 60875520-509d-4390-b37c-feab2414e139
    jinja: Translate the following text from Spanish to Arabic {{ targetString }}|||
      {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-spa-ara
    reference: ''
  6fd44ef5-8bfa-45fd-88ea-d52f0abc2954: !Template
    answer_choices: null
    id: 6fd44ef5-8bfa-45fd-88ea-d52f0abc2954
    jinja: Translate the following text from Arabic to Spanish {{ sourceString }}|||
      {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-ara-spa
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: ben-eng
templates:
  3e6cf1b1-5bdc-4401-a440-670e34f9d7f0: !Template
    answer_choices: null
    id: 3e6cf1b1-5bdc-4401-a440-670e34f9d7f0
    jinja: Translate the following text from English to Bengali {{ targetString }}|||
      {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-eng-ben
    reference: ''
  c3409e18-3cc6-427a-a31a-a536df9d1d83: !Template
    answer_choices: null
    id: c3409e18-3cc6-427a-a31a-a536df9d1d83
    jinja: Translate the following text from Bengali to English {{ sourceString }}|||
      {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-ben-eng
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: cat-eng
templates:
  28d68d88-1971-4c2f-bf9f-dacf247964e3: !Template
    answer_choices: null
    id: 28d68d88-1971-4c2f-bf9f-dacf247964e3
    jinja: Translate the following text from English to Catalan {{ targetString }}|||
      {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-eng-cat
    reference: ''
  b5fb150a-5b6d-460e-81a3-53ba4627ca9e: !Template
    answer_choices: null
    id: b5fb150a-5b6d-460e-81a3-53ba4627ca9e
    jinja: Translate the following text from Catalan to English {{ sourceString }}|||
      {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-cat-eng
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: cat-fra
templates:
  047e2324-7372-4996-98ef-32c39aa49605: !Template
    answer_choices: null
    id: 047e2324-7372-4996-98ef-32c39aa49605
    jinja: Translate the following text from French to Catalan {{ targetString }}|||
      {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-fra-cat
    reference: ''
  9221d310-8f77-40a5-830c-1b8c6b79c85c: !Template
    answer_choices: null
    id: 9221d310-8f77-40a5-830c-1b8c6b79c85c
    jinja: Translate the following text from Catalan to French {{ sourceString }}|||
      {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-cat-fra
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: cat-por
templates:
  106fd81a-4816-4644-bdb5-2c8302413b55: !Template
    answer_choices: null
    id: 106fd81a-4816-4644-bdb5-2c8302413b55
    jinja: Translate the following text from Portuguese to Catalan {{ targetString
      }}||| {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-por-cat
    reference: ''
  b11dfc4f-387d-4c8c-a992-df7a67c50d8b: !Template
    answer_choices: null
    id: b11dfc4f-387d-4c8c-a992-df7a67c50d8b
    jinja: Translate the following text from Catalan to Portuguese {{ sourceString
      }}||| {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-cat-por
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: cat-spa
templates:
  427fce5b-4075-470c-9882-4fae08013452: !Template
    answer_choices: null
    id: 427fce5b-4075-470c-9882-4fae08013452
    jinja: Translate the following text from Catalan to Spanish {{ sourceString }}|||
      {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-cat-spa
    reference: ''
  8178c5b0-a3dc-486d-ba90-7fc097fd4494: !Template
    answer_choices: null
    id: 8178c5b0-a3dc-486d-ba90-7fc097fd4494
    jinja: Translate the following text from Spanish to Catalan {{ targetString }}|||
      {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-spa-cat
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: eng-cmn_Hans
templates:
  81c4007c-70ba-4c70-b2e5-ea8c855a1b6e: !Template
    answer_choices: null
    id: 81c4007c-70ba-4c70-b2e5-ea8c855a1b6e
    jinja: Translate the following text from Mandarin Chinese to English {{ targetString
      }}||| {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-cmn_Hans-eng
    reference: ''
  a7c97c86-6eac-4dd6-969c-3e083851d8f7: !Template
    answer_choices: null
    id: a7c97c86-6eac-4dd6-969c-3e083851d8f7
    jinja: Translate the following text from English to Mandarin Chinese {{ sourceString
      }}||| {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-eng-cmn_Hans
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: eng-cmn_Hant
templates:
  48d886b8-8476-4d36-b751-b1add1c9b0dd: !Template
    answer_choices: null
    id: 48d886b8-8476-4d36-b751-b1add1c9b0dd
    jinja: Translate the following text from Mandarin Chinese to English {{ targetString
      }}||| {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-cmn_Hant-eng
    reference: ''
  8f6c9133-105d-46b1-810e-3224aac8df83: !Template
    answer_choices: null
    id: 8f6c9133-105d-46b1-810e-3224aac8df83
    jinja: Translate the following text from English to Mandarin Chinese {{ sourceString
      }}||| {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-eng-cmn_Hant
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: eng-eus
templates:
  87630907-872e-43fd-a965-c29bcfa44c4a: !Template
    answer_choices: null
    id: 87630907-872e-43fd-a965-c29bcfa44c4a
    jinja: Translate the following text from English to Basque {{ sourceString }}|||
      {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-eng-eus
    reference: ''
  ecdccaef-f7da-4b90-a40b-add38421c9d2: !Template
    answer_choices: null
    id: ecdccaef-f7da-4b90-a40b-add38421c9d2
    jinja: Translate the following text from Basque to English {{ targetString }}|||
      {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-eus-eng
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: eng-fra
templates:
  115c04c7-43ea-4681-bc52-304495acc3be: !Template
    answer_choices: null
    id: 115c04c7-43ea-4681-bc52-304495acc3be
    jinja: Translate the following text from French to English {{ targetString }}|||
      {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-fra-eng
    reference: ''
  fe9f5bcc-158e-4639-b7c9-b4721d376bc1: !Template
    answer_choices: null
    id: fe9f5bcc-158e-4639-b7c9-b4721d376bc1
    jinja: Translate the following text from English to French {{ sourceString }}|||
      {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-eng-fra
    reference: ''
@@ -0,0 +1,27 @@
dataset: Helsinki-NLP/tatoeba_mt
subset: eng-hin
templates:
  5bd1ad4b-d9ea-4a4d-bbfa-0f119142cca0: !Template
    answer_choices: null
    id: 5bd1ad4b-d9ea-4a4d-bbfa-0f119142cca0
    jinja: Translate the following text from English to Hindi {{ sourceString }}|||
      {{ targetString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-eng-hin
    reference: ''
  85f8628a-86aa-485f-8f59-e4a5c4a3e371: !Template
    answer_choices: null
    id: 85f8628a-86aa-485f-8f59-e4a5c4a3e371
    jinja: Translate the following text from Hindi to English {{ targetString }}|||
      {{ sourceString }}
    metadata: !TemplateMetadata
      choices_in_prompt: null
      languages: null
      metrics: null
      original_task: null
    name: translate-this-hin-eng
    reference: ''