Releases: allenai/allennlp
v2.4.0
What's new
Added 🎉
- Added a T5 implementation to `modules.transformers`.
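For illustration, a minimal sketch of loading the new T5 module. The `from_pretrained_module` call and the `"t5-small"` model name follow the transformer-toolkit's usual pattern and are assumptions, not taken from these notes:

```python
# Hedged sketch: load AllenNLP's T5 implementation from pretrained weights.
# The import path and the `from_pretrained_module` classmethod are assumed;
# check the allennlp.modules.transformer docs for your release.
from allennlp.modules.transformer import T5

t5 = T5.from_pretrained_module("t5-small")
```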
Changed ⚠️
- Weights & Biases callback can now work in anonymous mode (i.e. without the `WANDB_API_KEY` environment variable).
Fixed ✅
- The `GradientDescentTrainer` no longer leaves stray model checkpoints around when it runs out of patience.
- Fixed `cached_path()` for "hf://" files.
Commits
7c5cc98 Don't orphan checkpoints when we run out of patience (#5142)
6ec6459 allow W&B anon mode (#5110)
4e862a5 T5 (#4969)
7fc5a91 fix cached_path for hub downloads (#5141)
f877fdc Fairness Metrics (#5093)
v2.3.1
What's new
Added 🎉
- Added support for the HuggingFace Hub as an alternative way to handle loading files through `cached_path()`. Hub downloads should be made through the `hf://` URL scheme (see the sketch after this list).
- Added a new dimension to the `interpret` module: influence functions via the `InfluenceInterpreter` base class, along with a concrete implementation: `SimpleInfluence`.
- Added a `quiet` parameter to the `MultiProcessDataLoading` that disables `Tqdm` progress bars.
- The test for distributed metrics now takes a parameter specifying how often you want to run it.
Changed ⚠️
- Updated CONTRIBUTING.md to remind readers to upgrade `pip` and `setuptools` to avoid spaCy installation issues.
Fixed ✅
- Fixed a bug with the `ShardedDatasetReader` when used with multi-process data loading (#5132).
Commits
a84b9b1 Add cached_path support for HF hub (#5052)
24ec7db fix #5132 (#5134)
2526674 Update CONTRIBUTING.md (#5133)
c2ffb10 Add influence functions to interpret module (#4988)
0c7d60b Take the number of runs in the test for distributed metrics (#5127)
8be3828 fix docs CI
v2.3.0
What's new
Added 🎉
- Ported the following Huggingface `LambdaLR`-based schedulers: `ConstantLearningRateScheduler`, `ConstantWithWarmupLearningRateScheduler`, `CosineWithWarmupLearningRateScheduler`, `CosineHardRestartsWithWarmupLearningRateScheduler`.
- Added a new `sub_token_mode` parameter to the `pretrained_transformer_mismatched_embedder` class to support first sub-token embedding.
- Added a way to run a multi-task model with a dataset reader as part of `allennlp predict`.
- Added a new `eval_mode` parameter to `PretrainedTransformerEmbedder`. If it is set to `True`, the transformer is always run in evaluation mode, which, e.g., disables dropout and does not update batch normalization statistics (see the sketch after this list).
- Added additional parameters to the W&B callback: `entity`, `group`, `name`, `notes`, and `wandb_kwargs`.
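A minimal sketch of the new `eval_mode` flag; the model name here is just an example:

```python
from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

# Keep the transformer in evaluation mode even while the rest of the model
# trains: dropout is disabled and batch-norm statistics are not updated.
embedder = PretrainedTransformerEmbedder("bert-base-uncased", eval_mode=True)
```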
Changed ⚠️
- Sanity checks in the `GradientDescentTrainer` can now be turned off by setting the `run_sanity_checks` parameter to `False` (see the sketch after this list).
- Allow the order of examples in the task cards to be specified explicitly.
- The `histogram_interval` parameter is now deprecated in `TensorboardWriter`; please use `distribution_interval` instead.
- Memory usage is no longer logged to TensorBoard during training. `ConsoleLoggerCallback` should be used instead.
- If you use the `min_count` parameter of the `Vocabulary` but specify a namespace that does not exist, vocabulary creation will raise a `ConfigurationError`.
- Documentation updates made to `SoftmaxLoss` regarding padding and the expected shapes of the input and output tensors of `forward`.
- Moved the data preparation script for coref into allennlp-models.
- If a transformer is not in the cache but has override weights, the transformer's pretrained weights are no longer downloaded; only its `config.json` file is downloaded.
- `SanityChecksCallback` now raises `SanityCheckError` instead of `AssertionError` when a check fails.
- `jsonpickle` removed from dependencies.
- Improved the error message from `Registrable.by_name()` when the name passed does not match any registered subclasses. The error message will include a suggestion if there is a close match between the name passed and a registered name.
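A sketch of the new trainer flag. The other arguments shown are placeholders for objects you would construct yourself (assume `model`, `optimizer`, and `data_loader` already exist); only `run_sanity_checks` comes from this release:

```python
from allennlp.training import GradientDescentTrainer

# Hedged sketch: disable the default sanity checks when building the trainer
# directly in Python. Remaining trainer arguments are omitted for brevity.
trainer = GradientDescentTrainer(
    model=model,
    optimizer=optimizer,
    data_loader=data_loader,
    run_sanity_checks=False,
)
```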
Fixed ✅
- Fixed a bug where some `Activation` implementations could not be pickled due to involving a lambda function.
- Fixed the `__str__()` method on the `ModelCardInfo` class.
- Fixed a stall when using distributed training and gradient accumulation at the same time.
- Fixed an issue where using the `from_pretrained_transformer` `Vocabulary` constructor in distributed training via the `allennlp train` command would result in the data being iterated through unnecessarily.
- Fixed a bug regarding token indexers with the `InterleavingDatasetReader` when used with multi-process data loading.
- Fixed a warning from `transformers` when using `max_length` in the `PretrainedTransformerTokenizer`.
Removed 👋
- Removed the `stride` parameter to `PretrainedTransformerTokenizer`. This parameter had no effect.
Commits
c80e175 improve error message from Registrable class (#5125)
aca1623 Update docstring for basic_classifier (#5124)
059a64f remove jsonpickle from dependencies (#5121)
5fdce9a fix bug with interleaving dataset reader (#5122)
6e1f34c Predicting with a dataset reader on a multitask model (#5115)
b34df73 specify 'truncation' to avoid transformers warning (#5120)
0ddd3d3 Add eval_mode argument to pretrained transformer embedder (#5111)
99415e3 additional W&B params (#5114)
6ee1212 Adding a metadata field to the basic classifier (#5104)
2e8c3e2 Add link to gallery and demo in README (#5103)
de61100 Distributed training with gradient accumulation (#5100)
fe2d6e5 vocab fix (#5099)
d906175 Update transformers requirement from <4.5,>=4.1 to >=4.1,<4.6 (#5102)
99da315 fix str method of ModelCardInfo (#5096)
29f00ee Added new parameter 'sub_token_mode' to 'pretrained_transformer_mismatched_embedder' class to support first sub-token embedding (#4363) (#5087)
6021f7d Avoid from_pretrained download of model weights (#5085)
c3fb97e add SanityCheckError class (#5092)
decb875 Bring back `run_sanity_checks` parameter (#5091)
913fb8a Update mkdocs-material requirement from <7.1.0,>=5.5.0 to >=5.5.0,<7.2.0 (#5074)
f82d3f1 remove lambdas from activations (#5083)
bb70349 Replace master references with main in issue template (#5084)
87504c4 Ported Huggingface LambdaLR-based schedulers (#5082)
63a3b48 set transformer to evaluation mode (#5073)
542ce5d Move coref prep script (#5078)
bf8e71e compare namespace in counter and min_count (#3644)
4baf19a Arjuns/softmax loss documentation update (#5075)
59b9210 Allow example categories to be ordered (#5059)
3daa0ba tick version for nightly
bb77bd1 fix date in CHANGELOG
v2.2.0
What's new
Added 🎉
- Added `WandBCallback` class for Weights & Biases integration, registered as a callback under the name "wandb".
- Added `TensorBoardCallback` to replace the `TensorBoardWriter`. Registered as a callback under the name "tensorboard".
- Added `NormalizationBiasVerification` and `SanityChecksCallback` for model sanity checks. `SanityChecksCallback` runs by default from the `allennlp train` command. It can be turned off by setting `trainer.enable_default_callbacks` to `false` in your config.
- Added new method on the `Field` class: `.human_readable_repr() -> Any`, and new method on the `Instance` class: `.human_readable_dict() -> JsonDict` (@leo-liuzy).
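A small sketch of the new human-readable helpers; the printed output is only indicative:

```python
from allennlp.data import Instance
from allennlp.data.fields import LabelField

instance = Instance({"label": LabelField("positive")})
# New in 2.2.0: a JSON-friendly view of the instance's fields.
print(instance.human_readable_dict())  # e.g. {"label": "positive"}
```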
Removed 👋
- Removed `TensorBoardWriter`. Please use the `TensorBoardCallback` instead.
Changed ⚠️
- Use attributes of the `ModelOutputs` object in `PretrainedTransformerEmbedder` instead of indexing (@JohnGiorgi).
- Added support for PyTorch version 1.8 and `torchvision` version 0.9 (@nelson-liu).
- `Model.get_parameters_for_histogram_tensorboard_logging` is deprecated in favor of `Model.get_parameters_for_histogram_logging`.
Fixed ✅
- Makes sure tensors that are stored in `TensorCache` always live on CPUs.
- Fixed a bug where `FromParams` objects wrapped in `Lazy()` couldn't be pickled.
- Fixed a bug where the `ROUGE` metric couldn't be pickled.
- Fixed a bug reported in #5036: we now keep our spaCy POS tagger on (@leo-liuzy).
Commits
c5c9df5 refactor LogWriter, add W&B integration (#5061)
385124a Keep Spacy PoS tagger on by default (#5066)
15b532f Update transformers requirement from <4.4,>=4.1 to >=4.1,<4.5 (#5057)
3aafb92 clarify how predictions_to_labeled_instances work for targeted or non-targeted hotflip attack (#4957)
b897e57 ensure ROUGE metric can be pickled (#5051)
91e4af9 fix pickle bug for Lazy FromParams (#5049)
5b57be2 Adding normalization bias verification (#4990)
ce71901 Update torchvision requirement from <0.9.0,>=0.8.1 to >=0.8.1,<0.10.0 (#5041)
7f60990 Update torch requirement from <1.8.0,>=1.6.0 to >=1.6.0,<1.9.0 (#5037)
96415b2 Use HF Transformers output types (#5035)
0c36019 clean up (#5034)
d2bf35d Add methods for human readable representation of fields and instances (#4986)
a8b8006 Makes sure serialized tensors live on CPUs (#5026)
a0edfae Add options to log inputs in trainer (#4970)
Thanks to @nelson-liu for making sure we stay on top of releases! 😜
v1.5.0
What's new
Added 🎉
- Added a way to specify extra parameters to the predictor in an `allennlp predict` call.
- Added a way to initialize a `Vocabulary` from transformers models (see the sketch after this list).
- Support spaCy v3.
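A hedged sketch of initializing a vocabulary from a transformer model. The method name follows commit b5b72a0 below ("Add method to vocab to instantiate from a pretrained transformer"); treat the exact signature as an assumption:

```python
from allennlp.data import Vocabulary

# Hedged sketch: build a vocabulary whose token namespace mirrors a pretrained
# transformer's tokenizer. Additional keyword arguments (e.g. the namespace)
# are assumptions; check the Vocabulary docs for your release.
vocab = Vocabulary.from_pretrained_transformer("bert-base-uncased")
```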
Changed ⚠️
- Updated the `Paper` and `Dataset` classes in `ModelCard`.
Commits
55ac96a re-write docs commit history on releases (#4968)
c61178f Update spaCy to 3.0 (#4953)
be595df Ensure mean absolute error metric returns a float (#4983)
2556223 raise on HTTP errors in cached_path (#4984)
e1839cf Inputs to the FBetaMultiLabel metric were copied and pasted wrong (#4975)
b5b72a0 Add method to vocab to instantiate from a pretrained transformer (#4958)
025a0b2 Allows specifying extra arguments for predictors (#4947)
24c9c99 adding ModelUsage, rearranging fields (#4952)
v2.1.0
What's new
Changed ⚠️
- The `coding_scheme` parameter is now deprecated in `Conll2003DatasetReader`; please use `convert_to_coding_scheme` instead.
- Support spaCy v3.
Added 🎉
- Added `ModelUsage` to the `ModelCard` class.
- Added a way to specify extra parameters to the predictor in an `allennlp predict` call.
- Added a way to initialize a `Vocabulary` from transformers models.
- Added the ability to use `Predictors` with multitask models through the new `MultiTaskPredictor`.
- Added an example for fields of type `ListField[TextField]` to the `apply_token_indexers` API docs.
- Added `text_key` and `label_key` parameters to the `TextClassificationJsonReader` class (see the sketch after this list).
- Added `MultiOptimizer`, which allows you to use different optimizers for different parts of your model.
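A minimal sketch of the new `text_key`/`label_key` parameters, assuming JSON-lines data whose fields happen to be named `review` and `sentiment`:

```python
from allennlp.data.dataset_readers import TextClassificationJsonReader

# Read {"review": ..., "sentiment": ...} lines instead of the default
# {"text": ..., "label": ...} layout.
reader = TextClassificationJsonReader(text_key="review", label_key="sentiment")
```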
Fixed ✅
- The `@Registrable.register(...)` decorator no longer masks the decorated class's annotations.
- Ensured that `MeanAbsoluteError` always returns a `float` metric value instead of a `Tensor`.
- Learning rate schedulers that rely on metrics from the validation set were broken in v2.0.0. This release brings that functionality back.
- Fixed a bug where the `MultiProcessDataLoading` would crash when `num_workers > 0`, `start_method = "spawn"`, `max_instances_in_memory` is not `None`, and `batches_per_epoch` is not `None`.
- Fixed documentation and validation checks for `FBetaMultiLabelMetric`.
- Fixed handling of HTTP errors when fetching remote resources with `cached_path()`. Previously the content would be cached even when certain errors, like 404s, occurred. Now an `HTTPError` will be raised whenever the HTTP response is not OK.
- Fixed a bug where the `MultiTaskDataLoader` would crash when `num_workers > 0`.
- Fixed an import error that happens when PyTorch's distributed framework is unavailable on the system.
Commits
7c6adef Fix worker_info bug when num_workers > 0 (#5013)
9d88f8c Fixes predictors in the multitask case (#4991)
678518a Less opaque registrable annotations (#5010)
4b5fad4 Regex optimizer (#4981)
f091cb9 fix error when torch.distributed not available (#5011)
5974f54 Revert "drop support for Python 3.6 (#5012)" (#5016)
bdb0e20 Update mkdocs-material requirement from <6.3.0,>=5.5.0 to >=5.5.0,<7.1.0 (#5015)
d535de6 Bump mypy from 0.800 to 0.812 (#5007)
099786c Update responses requirement, remove pin on urllib3 (#4783)
b8cfb95 re-write docs commit history on releases (#4968)
c5c9edf Add text_key and label_key to TextClassificationJsonReader (#5005)
a02f67d drop support for Python 3.6 (#5012)
0078c59 Update spaCy to 3.0 (#4953)
be9537f Update CHANGELOG.md
828ee10 Update CHANGELOG.md
1cff6ad update README (#4993)
f8b3807 Add ListField example to apply token indexers (#4987)
7961b8b Ensure mean absolute error metric returns a float (#4983)
da4dba1 raise on HTTP errors in cached_path (#4984)
d4926f5 Inputs to the FBetaMultiLabel metric were copied and pasted wrong (#4975)
d2ae540 Update transformers requirement from <4.3,>=4.1 to >=4.1,<4.4 (#4967)
bf8eeaf Add method to vocab to instantiate from a pretrained transformer (#4958)
9267ce7 Resize transformers word embeddings layer for additional_special_tokens (#4946)
52c23dd Introduce `convert_to_coding_scheme` and make `coding_scheme` deprecated in CoNLL2003DatasetReader (#4960)
c418f84 Fixes recording validation metrics for learning rate schedulers that rely on it (#4959)
4535f5c adding ModelUsage, rearranging fields (#4952)
1ace4bb fix bug with MultiProcessDataLoader (#4956)
6f22291 Allows specifying extra arguments for predictors (#4947)
2731db1 tick version for nightly release
v2.0.1
What's new
A couple of minor fixes and additions since the 2.0 release.
Added 🎉
- Added `tokenizer_kwargs` and `transformer_kwargs` arguments to `PretrainedTransformerBackbone`.
Changed ⚠️
- `GradientDescentTrainer` now creates the `serialization_dir` directory when it's instantiated, if it doesn't already exist.
Fixed ✅
- `common.util.sanitize` now handles sets.
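For example (the set-to-list conversion is what this fix adds; the exact output ordering is not guaranteed):

```python
from allennlp.common.util import sanitize

# Sets can now be turned into JSON-serializable output instead of raising.
print(sanitize({"labels": {"pos", "neg"}}))  # e.g. {"labels": ["pos", "neg"]}
```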
Commits
caa497f Update `GradientDescentTrainer` to automatically create directory for `serialization_dir` (#4940)
cd96d95 Sanitize set (#4945)
f0ae9f3 Adding tokenizer_kwargs argument to PretrainedTransformerBackbone constructor. (#4944)
501b0ab Fixing papers and datasets (#4919)
fa625ec Adding missing transformer_kwargs arg that was recently added to PretrainedTransformerEmbedder (#4941)
96ea483 Add missing "Unreleased" section to CHANGELOG
v1.4.1
v2.0.0
AllenNLP v2.0.0 Release Notes
The 2.0 release of AllenNLP represents a major engineering effort that brings several exciting new features to the library, as well as a focus on performance.
If you're upgrading from AllenNLP 1.x, we encourage you to read our comprehensive upgrade guide.
Main new features
AllenNLP gets eyes 👀
One of the most exciting areas in ML research is multimodal learning, and AllenNLP is now taking its first steps in this direction with support for 2 tasks and 3 datasets in the vision + text domain. Check out our ViLBERT for VQA and Visual Entailment models, along with the VQAv2, Visual Entailment, and GQA dataset readers in `allennlp-models`.
Transformer toolkit
The transformer toolkit offers a collection of modules to experiment with various transformer architectures, such as `SelfAttention`, `TransformerEmbeddings`, `TransformerLayer`, etc. It also simplifies the way one can take apart the pretrained transformer weights for an existing module and combine them in different ways. For instance, one can pull out the first 8 layers of `bert-base-uncased` to separately encode two text inputs, combine the representations in some way, and then use the last 4 layers on the combined representation (more examples can be found in `allennlp.modules.transformer`).
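As a hedged sketch of the "first 8 layers" idea above: the `TransformerStack` class and the `from_pretrained_module` constructor follow the toolkit's pattern, but the exact argument names shown here are assumptions:

```python
from allennlp.modules.transformer import TransformerStack

# Hedged sketch: build a stack from the first 8 layers of bert-base-uncased.
# `num_hidden_layers` is an assumed argument name; check the toolkit docs for
# the exact constructor signature in your release.
first_eight_layers = TransformerStack.from_pretrained_module(
    "bert-base-uncased", num_hidden_layers=8
)
```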
The toolkit also contains modules for bimodal architectures such as ViLBERT. Modules include `BiModalEncoder`, which encodes two modalities separately and performs bi-directional attention (`BiModalAttention`) using a connection layer (`BiModalConnectionLayer`). The `VisionTextModel` class is an example of a model that uses these bimodal layers.
Multi-task learning
2.0 adds support for multi-task learning throughout the AllenNLP system. In multi-task learning, the model consists of a backbone that is common to all the tasks and tends to be the larger part of the model, plus multiple task-specific heads that use the output of the backbone to make predictions for a specific task. This way, the backbone sees many more training examples than you might have available for a single task and can thus produce better representations, which benefits all tasks. The canonical example for this is BERT, where the backbone is made up of the transformer stack, and then there are multiple model heads that do classification, tagging, masked-token prediction, etc. AllenNLP 2.0 helps you build such models by giving you those abstractions. The `MultiTaskDatasetReader` can read datasets for multiple tasks at once. The `MultiTaskDataLoader` loads the instances from the reader and makes batches. The trainer feeds these batches to a `MultiTaskModel`, which consists of a `Backbone` and multiple `Head`s. If you want to look at the details of how this works, we have an example config available at https://github.com/allenai/allennlp-models/blob/main/training_config/vision/vilbert_multitask.jsonnet.
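For orientation, a hedged sketch of the overall shape such an experiment config takes, written as a Python dict rather than jsonnet. The reader, backbone, and head entries are placeholders, and the linked `vilbert_multitask.jsonnet` above is the authoritative example:

```python
# Hedged sketch: the layout of a multi-task experiment configuration.
# "<...>" strings are placeholders, not runnable config values.
config = {
    "dataset_reader": {
        "type": "multitask",
        "readers": {"task_a": "<reader config>", "task_b": "<reader config>"},
    },
    "data_loader": {"type": "multitask", "scheduler": {"batch_size": 32}},
    "model": {
        "type": "multitask",
        "backbone": "<shared Backbone config>",
        "heads": {"task_a": "<Head config>", "task_b": "<Head config>"},
    },
}
```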
Changes since v2.0.0rc1
Added 🎉
- The `TrainerCallback` constructor accepts the `serialization_dir` provided by the `Trainer`. This can be useful for `Logger` callbacks that need to store files in the run directory.
- `TrainerCallback.on_start()` is fired at the start of training.
- The `TrainerCallback` event methods now accept `**kwargs`. This should make it easier to maintain backwards compatibility of callbacks in the future. E.g., we may decide to pass the exception/traceback object in case of failure to `on_end()`, and older callbacks can simply ignore the argument instead of raising a `TypeError`.
- Added a `TensorBoardCallback` which wraps the `TensorBoardWriter`.
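A hedged sketch of a custom callback using the new hooks. The import path and hook signature beyond what the notes above state are assumptions, and `"print-run-dir"` is a made-up registered name:

```python
from allennlp.training import TrainerCallback  # import path may differ by release

@TrainerCallback.register("print-run-dir")  # hypothetical name
class PrintRunDirCallback(TrainerCallback):
    def on_start(self, trainer, **kwargs) -> None:
        # The constructor now receives serialization_dir from the Trainer,
        # so callbacks can write files into the run directory.
        print("Run directory:", self.serialization_dir)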
Changed ⚠️
- `TrainerCallback.on_epoch()` does not fire with `epoch=-1` at the start of training. Instead, `TrainerCallback.on_start()` should be used for these cases.
- `TensorBoardBatchMemoryUsage` is converted from a `BatchCallback` into a `TrainerCallback`.
- `TrackEpochCallback` is converted from an `EpochCallback` into a `TrainerCallback`.
- `Trainer` can accept callbacks simply with the name `callbacks` instead of `trainer_callbacks`.
- `TensorboardWriter` renamed to `TensorBoardWriter`, and removed as an argument to the `GradientDescentTrainer`. In order to enable TensorBoard logging during training, you should use the `TensorBoardCallback` instead.
Removed 👋
- Removed `EpochCallback` and `BatchCallback` in favour of `TrainerCallback`. The metaclass-wrapping implementation is removed as well.
- Removed the `tensorboard_writer` parameter to `GradientDescentTrainer`. You should use the `TensorBoardCallback` now instead.
Fixed ✅
- The `Trainer` now always fires `TrainerCallback.on_end()` so all the resources can be cleaned up properly.
- Fixed the misspelling: changed `TensoboardBatchMemoryUsage` to `TensorBoardBatchMemoryUsage`.
- We now set a value for `epoch` so that the variable is bound when `TrainerCallback.on_end()` fires. Previously this could have led to an error when trying to recover a run after it had finished training.
Commits since v2.0.0rc1
1530082 Log to TensorBoard through a TrainerCallback in GradientDescentTrainer (#4913)
8b95316 ci quick fix
fa1dc7b Add link to upgrade guide to README (#4934)
7364da0 Fix parameter name in the documentation
00e3ff2 tick version for nightly release
67fa291 Merging vision into main (#4800)
65e50b3 Bump mypy from 0.790 to 0.800 (#4927)
a744535 fix mkdocs config (#4923)
ed322eb A helper for distributed reductions (#4920)
9ab2bf0 add CUDA 10.1 Docker image (#4921)
d82287e Update transformers requirement from <4.1,>=4.0 to >=4.0,<4.2 (#4872)
4183a49 Update mkdocs-material requirement from <6.2.0,>=5.5.0 to >=5.5.0,<6.3.0 (#4880)
54e85ee disable codecov annotations (#4902)
2623c4b Making TrackEpochCallback an EpochCallback (#4893)
1d21c75 issue warning instead of failing when lock can't be acquired on a resource that exists in a read-only file system (#4867)
ec197c3 Create pull_request_template.md (#4891)
9cf41b2 fix navbar link
9635af8 rename 'master' -> 'main' (#4887)
d0a07fb docs: fix simple typo, multplication -> multiplication (#4883)
d1f032d Moving modelcard and taskcard abstractions to main repo (#4881)
1fff7ca Update docker torch version (#4873)
d2aea97 Fix typo in str (#4874)
6a8d425 add CombinedLearningRateScheduler (#4871)
a3732d0 Fix cache volume (#4869)
832901e Turn superfluous warning to info when extending the vocab in the embedding matrix (#4854)
v1.4.0
What's new
Added 🎉
- Added a `FileLock` class to `common.file_utils`. This is just like the `FileLock` from the `filelock` library, except that it adds an optional flag `read_only_ok: bool`, which when set to `True` changes the behavior so that a warning will be emitted instead of an exception when lacking write permissions on an existing file lock. This makes it possible to use the `FileLock` class on a read-only file system (see the sketch after this list).
- Added a new learning rate scheduler: `CombinedLearningRateScheduler`. This can be used to combine different LR schedulers, using one after the other.
- Added an official CUDA 10.1 Docker image.
- Moved the `ModelCard` and `TaskCard` abstractions into the main repository.
- Added a util function `allennlp.nn.util.dist_reduce(...)` for handling distributed reductions. This is especially useful when implementing a distributed `Metric`.
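A minimal sketch of the new flag, assuming the (hypothetical) lock path sits on a read-only file system:

```python
from allennlp.common.file_utils import FileLock

# With read_only_ok=True, lacking write permission on the lock file produces
# a warning instead of an exception (the path shown is hypothetical).
with FileLock("/read-only-fs/model.lock", read_only_ok=True):
    pass  # read the protected resource here
```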
Changed ⚠️
- 'master' branch renamed to 'main'
- Torch version bumped to 1.7.1 in Docker images.
Fixed ✅
- Fixed typo with `LabelField` string representation: removed trailing apostrophe.
- `Vocabulary.from_files` and `cached_path` will issue a warning, instead of failing, when a lock on an existing resource can't be acquired because the file system is read-only.
- `TrackEpochCallback` is now an `EpochCallback`.
Commits
4de78ac Make CI run properly on the 1.x branch
65e50b3 Bump mypy from 0.790 to 0.800 (#4927)
a744535 fix mkdocs config (#4923)
ed322eb A helper for distributed reductions (#4920)
9ab2bf0 add CUDA 10.1 Docker image (#4921)
d82287e Update transformers requirement from <4.1,>=4.0 to >=4.0,<4.2 (#4872)
4183a49 Update mkdocs-material requirement from <6.2.0,>=5.5.0 to >=5.5.0,<6.3.0 (#4880)
54e85ee disable codecov annotations (#4902)
2623c4b Making TrackEpochCallback an EpochCallback (#4893)
1d21c75 issue warning instead of failing when lock can't be acquired on a resource that exists in a read-only file system (#4867)
ec197c3 Create pull_request_template.md (#4891)
9cf41b2 fix navbar link
9635af8 rename 'master' -> 'main' (#4887)
d0a07fb docs: fix simple typo, multplication -> multiplication (#4883)
d1f032d Moving modelcard and taskcard abstractions to main repo (#4881)
1fff7ca Update docker torch version (#4873)
d2aea97 Fix typo in str (#4874)
6a8d425 add CombinedLearningRateScheduler (#4871)
a3732d0 Fix cache volume (#4869)
832901e Turn superfluous warning to info when extending the vocab in the embedding matrix (#4854)