# flow.dag.yaml (forked from microsoft/promptflow)
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
environment:
  python_requirements_txt: requirements.txt
inputs:
  question:
    type: string
    default: What is the name of the new language representation model introduced in
      the document?
  answer:
    type: string
    default: The document mentions multiple language representation models, so it is
      unclear which one is being referred to as "new". Can you provide more
      specific information or context?
  context:
    type: string
    default: '["statistical language modeling. arXiv preprint arXiv:1312.3005 . Z.
      Chen, H. Zhang, X. Zhang, and L. Zhao. 2018. Quora question pairs.
      Christopher Clark and Matt Gardner. 2018. Simple and effective
      multi-paragraph reading comprehen- sion. In ACL.Kevin Clark, Minh-Thang
      Luong, Christopher D Man- ning, and Quoc Le. 2018. Semi-supervised se-
      quence modeling with cross-view training. In Pro- ceedings of the 2018
      Conference on Empirical Meth- ods in Natural Language Processing , pages
      1914\u2013 1925. Ronan Collobert and Jason Weston. 2008. A uni\ufb01ed
      architecture for natural language processing: Deep neural networks with
      multitask learning. In Pro- ceedings of the 25th international conference
      on Machine learning , pages 160\u2013167. ACM. Alexis Conneau, Douwe
      Kiela, Holger Schwenk, Lo \u00a8\u0131c Barrault, and Antoine Bordes.
      2017. Supervised learning of universal sentence representations from
      natural language inference data. In Proceedings of the 2017 Conference on
      Empirical Methods in Nat- ural Language Processing , pages 670\u2013680,
      Copen- hagen, Denmark. Association for Computational Linguistics. Andrew M
      Dai and Quoc V Le. 2015. Semi-supervised sequence learning. In Advances in
      neural informa- tion processing systems , pages 3079\u20133087. J. Deng,
      W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei- Fei. 2009. ImageNet: A
      Large-Scale Hierarchical Image Database. In CVPR09 . William B Dolan and
      Chris Brockett. 2005. Automati- cally constructing a corpus of sentential
      paraphrases. InProceedings of the Third International Workshop on
      Paraphrasing (IWP2005) . William Fedus, Ian Goodfellow, and Andrew M Dai.
      2018. Maskgan: Better text generation via \ufb01lling in the.arXiv
      preprint arXiv:1801.07736 . Dan Hendrycks and Kevin Gimpel. 2016. Bridging
      nonlinearities and stochastic regularizers with gaussian error linear
      units. CoRR , abs\/1606.08415. Felix Hill, Kyunghyun Cho, and Anna
      Korhonen. 2016. Learning distributed representations of sentences from
      unlabelled data. In Proceedings of the 2016 Conference of the North
      American Chapter of the Association for Computational Linguistics: Human
      Language Technologies . Association for Computa- tional Linguistics.
      Jeremy Howard and Sebastian Ruder. 2018. Universal language model
      \ufb01ne-tuning for text classi\ufb01cation. In ACL. Association for
      Computational Linguistics. Minghao Hu, Yuxing Peng, Zhen Huang, Xipeng
      Qiu, Furu Wei, and Ming Zhou. 2018. Reinforced mnemonic reader for machine
      reading comprehen- sion. In IJCAI . Yacine Jernite, Samuel R. Bowman, and
      David Son- tag. 2017. Discourse-based objectives for fast un- supervised
      sentence representation learning. CoRR , abs\/1705.00557.Mandar Joshi,
      Eunsol Choi, Daniel S Weld, and Luke Zettlemoyer. 2017. Triviaqa: A large
      scale distantly supervised challenge dataset for reading comprehen- sion.
      In ACL. Ryan Kiros, Yukun Zhu, Ruslan R Salakhutdinov, Richard Zemel,
      Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. Skip-thought
      vectors. In Advances in neural information processing systems , pages
      3294\u20133302. Quoc Le and Tomas Mikolov. 2014. Distributed rep-
      resentations of sentences and documents. In Inter- national Conference on
      Machine Learning , pages 1188\u20131196. Hector J Levesque, Ernest Davis,
      and Leora Morgen- stern. 2011. The winograd schema challenge. In Aaai
      spring symposium: Logical formalizations of commonsense reasoning , volume
      46, page 47. Lajanugen Logeswaran and Honglak Lee. 2018. An ef\ufb01cient
      framework for learning sentence represen- tations. In International
      Conference on Learning Representations . Bryan McCann, James Bradbury,
      Caiming Xiong, and Richard Socher. 2017. Learned in translation:
      Con-","tool for measuring readability. Journalism Bulletin ,
      30(4):415\u2013433. Erik F Tjong Kim Sang and Fien De Meulder. 2003.
      Introduction to the conll-2003 shared task: Language-independent named
      entity recognition. In CoNLL . Joseph Turian, Lev Ratinov, and Yoshua
      Bengio. 2010. Word representations: A simple and general method for
      semi-supervised learning. In Proceedings of the 48th Annual Meeting of the
      Association for Compu- tational Linguistics , ACL \u201910, pages
      384\u2013394. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit,
      Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017.
      Attention is all you need. In Advances in Neural Information Pro- cessing
      Systems , pages 6000\u20136010. Pascal Vincent, Hugo Larochelle, Yoshua
      Bengio, and Pierre-Antoine Manzagol. 2008. Extracting and composing robust
      features with denoising autoen- coders. In Proceedings of the 25th
      international conference on Machine learning , pages 1096\u20131103. ACM.
      Alex Wang, Amanpreet Singh, Julian Michael, Fe- lix Hill, Omer Levy, and
      Samuel Bowman. 2018a. Glue: A multi-task benchmark and analysis
      platformfor natural language understanding. In Proceedings of the 2018
      EMNLP Workshop BlackboxNLP: An- alyzing and Interpreting Neural Networks
      for NLP , pages 353\u2013355. Wei Wang, Ming Yan, and Chen Wu. 2018b.
      Multi- granularity hierarchical attention fusion networks for reading
      comprehension and question answering. InProceedings of the 56th Annual
      Meeting of the As- sociation for Computational Linguistics (Volume 1: Long
      Papers) . Association for Computational Lin- guistics. Alex Warstadt,
      Amanpreet Singh, and Samuel R Bow- man. 2018. Neural network acceptability
      judg- ments. arXiv preprint arXiv:1805.12471 . Adina Williams, Nikita
      Nangia, and Samuel R Bow- man. 2018. A broad-coverage challenge corpus for
      sentence understanding through inference. In NAACL . Yonghui Wu, Mike
      Schuster, Zhifeng Chen, Quoc V Le, Mohammad Norouzi, Wolfgang Macherey,
      Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. 2016.
      Google\u2019s neural ma- chine translation system: Bridging the gap
      between human and machine translation. arXiv preprint arXiv:1609.08144 .
      Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. 2014. How
      transferable are features in deep neural networks? In Advances in neural
      information processing systems , pages 3320\u20133328. Adams Wei Yu, David
      Dohan, Minh-Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, and Quoc V
      Le. 2018. QANet: Combining local convolution with global self-attention
      for reading comprehen- sion. In ICLR . Rowan Zellers, Yonatan Bisk, Roy
      Schwartz, and Yejin Choi. 2018. Swag: A large-scale adversarial dataset
      for grounded commonsense inference. In Proceed- ings of the 2018
      Conference on Empirical Methods in Natural Language Processing (EMNLP) .
      Yukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhut- dinov, Raquel Urtasun,
      Antonio Torralba, and Sanja Fidler. 2015. Aligning books and movies:
      Towards story-like visual explanations by watching movies and reading
      books. In Proceedings of the IEEE international conference on computer
      vision , pages 19\u201327. Appendix for \u201cBERT: Pre-training of Deep
      Bidirectional Transformers for Language Understanding\u201d We organize
      the appendix into three sections: \u2022 Additional implementation details
      for BERT are presented in Appendix A;\u2022 Additional details for our
      experiments are presented in Appendix B; and \u2022 Additional ablation
      studies are presented in Appendix C. We present additional ablation
      studies for BERT including: \u2013Effect of Number of Training Steps; and
      \u2013Ablation for Different"]} {"question": "What is the main difference
      between BERT and previous language representation models?", "variant_id":
"v1", "line_number": 2, answer":"BERT is designed to pre-train deep
      bidirectional representations from unlabeled text by jointly conditioning
      on both left and right context in all layers, allowing it to incorporate
      context from both directions. This is unlike previous language
      representation models that are unidirectional, which limits the choice of
      architectures that can be used during pre-training and could be
      sub-optimal for sentence-level tasks and token-level tasks such as
      question answering.","context":["BERT: Pre-training of Deep Bidirectional
      Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton
      Lee Kristina Toutanova Google AI Language
      fjacobdevlin,mingweichang,kentonl,kristout g@google.com Abstract We
      introduce a new language representa- tion model called BERT , which stands
      for Bidirectional Encoder Representations from Transformers. Unlike recent
      language repre- sentation models (Peters et al., 2018a; Rad- ford et al.,
      2018), BERT is designed to pre- train deep bidirectional representations
      from unlabeled text by jointly conditioning on both left and right context
      in all layers. As a re- sult, the pre-trained BERT model can be \ufb01ne-
      tuned with just one additional output layer to create state-of-the-art
      models for a wide range of tasks, such as question answering and language
      inference, without substantial task- speci\ufb01c architecture
      modi\ufb01cations. BERT is conceptually simple and empirically powerful.
      It obtains new state-of-the-art re- sults on eleven natural language
      processing tasks, including pushing the GLUE score to 80.5% (7.7% point
      absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute
      improvement), SQuAD v1.1 question answer- ing Test F1 to 93.2 (1.5 point
      absolute im- provement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute
      improvement). 1 Introduction Language model pre-training has been shown to
      be effective for improving many natural language processing tasks (Dai and
      Le, 2015; Peters et al., 2018a; Radford et al., 2018; Howard and Ruder,
      2018). These include sentence-level tasks such as natural language
      inference (Bowman et al., 2015; Williams et al., 2018) and paraphrasing
      (Dolan and Brockett, 2005), which aim to predict the re- lationships
      between sentences by analyzing them holistically, as well as token-level
      tasks such as named entity recognition and question answering, where
      models are required to produce \ufb01ne-grained output at the token level
      (Tjong Kim Sang and De Meulder, 2003; Rajpurkar et al., 2016).There are
      two existing strategies for apply- ing pre-trained language
      representations to down- stream tasks: feature-based and\ufb01ne-tuning .
      The feature-based approach, such as ELMo (Peters et al., 2018a), uses
      task-speci\ufb01c architectures that include the pre-trained
      representations as addi- tional features. The \ufb01ne-tuning approach,
      such as the Generative Pre-trained Transformer (OpenAI GPT) (Radford et
      al., 2018), introduces minimal task-speci\ufb01c parameters, and is
      trained on the downstream tasks by simply \ufb01ne-tuning allpre- trained
      parameters. The two approaches share the same objective function during
      pre-training, where they use unidirectional language models to learn
      general language representations. We argue that current techniques
      restrict the power of the pre-trained representations, espe- cially for
      the \ufb01ne-tuning approaches. The ma- jor limitation is that standard
      language models are unidirectional, and this limits the choice of archi-
      tectures that can be used during pre-training. For example, in OpenAI GPT,
      the authors use a left-to- right architecture, where every token can only
      at- tend to previous tokens in the self-attention layers of the
      Transformer (Vaswani et al., 2017). Such re- strictions are sub-optimal
      for sentence-level tasks, and could be very harmful when applying
      \ufb01ne- tuning based approaches to token-level tasks such as question
      answering, where it is crucial to incor- porate context from both
      directions. In this paper, we improve the \ufb01ne-tuning based approaches
      by proposing BERT: Bidirectional Encoder Representations from
      Transformers.","the self-attention layers of the Transformer (Vaswani et
      al., 2017). Such re- strictions are sub-optimal for sentence-level tasks,
      and could be very harmful when applying \ufb01ne- tuning based approaches
      to token-level tasks such as question answering, where it is crucial to
      incor- porate context from both directions. In this paper, we improve the
      \ufb01ne-tuning based approaches by proposing BERT: Bidirectional Encoder
      Representations from Transformers. BERT alleviates the previously
      mentioned unidi- rectionality constraint by using a \u201cmasked lan-
      guage model\u201d (MLM) pre-training objective, in- spired by the Cloze
      task (Taylor, 1953). The masked language model randomly masks some of the
      tokens from the input, and the objective is to predict the original
      vocabulary id of the maskedarXiv:1810.04805v2 [cs.CL] 24 May 2019word
      based only on its context. Unlike left-to- right language model
      pre-training, the MLM ob- jective enables the representation to fuse the
      left and the right context, which allows us to pre- train a deep
      bidirectional Transformer. In addi- tion to the masked language model, we
      also use a \u201cnext sentence prediction\u201d task that jointly pre-
      trains text-pair representations. The contributions of our paper are as
      follows: \u2022 We demonstrate the importance of bidirectional
      pre-training for language representations. Un- like Radford et al. (2018),
      which uses unidirec- tional language models for pre-training, BERT uses
      masked language models to enable pre- trained deep bidirectional
      representations. This is also in contrast to Peters et al. (2018a), which
      uses a shallow concatenation of independently trained left-to-right and
      right-to-left LMs. \u2022 We show that pre-trained representations reduce
      the need for many heavily-engineered task- speci\ufb01c architectures.
      BERT is the \ufb01rst \ufb01ne- tuning based representation model that
      achieves state-of-the-art performance on a large suite of sentence-level
      andtoken-level tasks, outper- forming many task-speci\ufb01c
      architectures. \u2022 BERT advances the state of the art for eleven NLP
      tasks. The code and pre-trained mod- els are available at
      https:\/\/github.com\/ google-research\/bert . 2 Related Work There is a
      long history of pre-training general lan- guage representations, and we
      brie\ufb02y review the most widely-used approaches in this section. 2.1
      Unsupervised Feature-based Approaches Learning widely applicable
      representations of words has been an active area of research for decades,
      including non-neural (Brown et al., 1992; Ando and Zhang, 2005; Blitzer et
      al., 2006) and neural (Mikolov et al., 2013; Pennington et al., 2014)
      methods. Pre-trained word embeddings are an integral part of modern NLP
      systems, of- fering signi\ufb01cant improvements over embeddings learned
      from scratch (Turian et al., 2010). To pre- train word embedding vectors,
      left-to-right lan- guage modeling objectives have been used (Mnih and
      Hinton, 2009), as well as objectives to dis- criminate correct from
      incorrect words in left and right context (Mikolov et al., 2013).These
      approaches have been generalized to coarser granularities, such as
      sentence embed- dings (Kiros et al., 2015; Logeswaran and Lee, 2018) or
      paragraph embeddings (Le and Mikolov, 2014). "]'
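  # The input defaults above form one sample evaluation line: a question, the
  # answer under test, and the retrieved context (raw text extracted from the
  # BERT paper, including its reference list).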
outputs:
  groundedness:
    type: string
    reference: ${parse_score.output}
nodes:
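# Graph: gpt_groundedness (llm) -> parse_score (python) -> aggregate (python).
# The llm node rates how grounded the answer is in the context, parse_score
# turns the raw completion into a number, and aggregate reduces the per-line
# scores into a single run metric.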
- name: parse_score
  type: python
  source:
    type: code
    path: calc_groundedness.py
  inputs:
    gpt_score: ${gpt_groundedness.output}
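# Aggregation node: with aggregation set to true, promptflow runs this once per
# batch, passing the list of parse_score outputs collected across all lines.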
- name: aggregate
  type: python
  source:
    type: code
    path: aggregate.py
  inputs:
    groundedness_scores: ${parse_score.output}
  aggregation: true
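# LLM node: max_tokens 5 with temperature 0 forces a short, deterministic
# completion; the prompt in gpt_groundedness.md presumably asks for a bare
# numeric rating that parse_score can extract.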
- name: gpt_groundedness
  type: llm
  source:
    type: code
    path: gpt_groundedness.md
  inputs:
    # Keeping both of the next two inputs makes it easy to switch between
    # OpenAI and Azure OpenAI: deployment_name is required by Azure OpenAI;
    # model is required by OpenAI.
    deployment_name: gpt-35-turbo
    model: gpt-3.5-turbo
    max_tokens: 5
    answer: ${inputs.answer}
    question: ${inputs.question}
    context: ${inputs.context}
    temperature: 0
  connection: open_ai_connection
  api: chat
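The contents of calc_groundedness.py and aggregate.py are not shown on this page, so the sketches below are only a rough guide to what the two python nodes likely do. First, a minimal sketch of calc_groundedness.py, assuming the standard promptflow `@tool` decorator; the regex and the fallback value are illustrative, not the actual implementation:

```python
import re

from promptflow import tool


@tool
def parse_score(gpt_score: str) -> float:
    """Extract a numeric rating from the LLM completion.

    With max_tokens: 5 and temperature: 0, gpt_groundedness should return a
    bare number such as "5", but a regex keeps parsing robust to stray text.
    """
    match = re.search(r"\d+(\.\d+)?", gpt_score)
    # Fall back to the lowest rating when no number is found (illustrative).
    return float(match.group()) if match else 1.0
```

A sketch of aggregate.py under the same assumptions; a simple mean is one plausible reduction, and the metric name is made up for illustration:

```python
from typing import List

from promptflow import log_metric, tool


@tool
def aggregate(groundedness_scores: List[float]) -> float:
    """Reduce the per-line scores collected by the aggregation node."""
    average = (
        sum(groundedness_scores) / len(groundedness_scores)
        if groundedness_scores
        else 0.0
    )
    # Record the value as a run metric (the metric name is an assumption).
    log_metric("avg_groundedness", average)
    return average
```

Assuming a connection named open_ai_connection has been created, the flow can be exercised against the defaults above with `pf flow test --flow .` run from the flow directory.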