Text_Summarizer_PyTorch

This project aims to build a text summarizer with a frontend, using PyTorch and Hugging Face.

Instructions to use

Pull the image from DockerHub:

docker pull antonbeloval08/text-summarizer

OR

Build the image yourself:

git clone https://github.com/Commit2Cosmos/Text_Summarizer_PyTorch.git
cd Text_Summarizer_PyTorch
docker build --no-cache -t text-summarizer .

Create a container:

docker run --name text-summarizer-test -p 8000:8000 text-summarizer

Visit the webapp: http://0.0.0.0:8000 (if your browser cannot open this address, try http://localhost:8000, which reaches the same published port)
To train: open the "Training" section, press "Try it out", then press "Execute".
To summarize: open the "Inference" section, press "Try it out", enter the text you want summarized, then press "Execute".
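
The same two actions can presumably be scripted instead of clicked. The following is a minimal sketch using only the Python standard library, assuming the container from the previous step is running with port 8000 published. The endpoint paths `/train` and `/predict`, the HTTP methods, and the `text` query parameter are hypothetical placeholders that do not come from this README; replace them with the names shown in the web app.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # port published by `docker run -p 8000:8000`

# Hypothetical endpoint paths: replace with the ones listed in the web app.
TRAIN_PATH = "/train"
PREDICT_PATH = "/predict"


def build_train_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the request that would start training."""
    return urllib.request.Request(base_url + TRAIN_PATH, method="GET")


def build_summarize_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the request that would summarize `text`."""
    # "text" is an assumed parameter name, not one confirmed by the project.
    query = urllib.parse.urlencode({"text": text})
    return urllib.request.Request(f"{base_url}{PREDICT_PATH}?{query}", method="POST")


def send(request: urllib.request.Request, timeout: float = 600.0) -> str:
    """Send a prepared request to the running container and return the body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the container running, `send(build_summarize_request("some long text"))` would return the raw response body. Building and sending are kept separate so the request can be inspected before anything is sent.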

-------------------------- DEV NOTES --------------------------

TODO

Add ability to control parameters in params.json in the web app
Change training + evaluation components to work with multiple datasets (dynamic saving file paths etc)
Resolve the warning "Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file"
Raise an issue about the incorrect warning that the model needs to be trained because of newly initialised encoding layers
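
For the GenerationConfig item, one possible approach is to move the generation-related keys out of the model's config.json into a separate generation_config.json, the file name Hugging Face Transformers uses for saved generation settings. The sketch below uses only the standard library and assumes the model config is a plain JSON file in a local directory. The `GENERATION_KEYS` tuple and the function name are illustrative assumptions rather than the project's code, and the tuple is not a complete list of generation parameters.

```python
import json
from pathlib import Path

# Assumed subset of generation parameters; extend it to match the warning you see.
GENERATION_KEYS = (
    "max_length",
    "min_length",
    "num_beams",
    "length_penalty",
    "no_repeat_ngram_size",
)


def split_generation_params(model_dir: str) -> dict:
    """Move generation keys from config.json into generation_config.json."""
    config_path = Path(model_dir) / "config.json"
    generation_path = Path(model_dir) / "generation_config.json"

    config = json.loads(config_path.read_text(encoding="utf-8"))

    # Keep any settings already stored in an existing generation_config.json.
    generation = {}
    if generation_path.exists():
        generation = json.loads(generation_path.read_text(encoding="utf-8"))

    # Pop each listed key that is present so it no longer lives in the model config.
    moved = {key: config.pop(key) for key in GENERATION_KEYS if key in config}
    generation.update(moved)

    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    generation_path.write_text(json.dumps(generation, indent=2), encoding="utf-8")
    return moved
```

The function returns only the keys it moved, so a caller can log what changed. Whether this silences the warning for a given model still needs to be checked against the installed Transformers version.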

Milestones

Choose and download the pre-trained model and dataset for transfer learning
Build pipelines (listed below)
Check for transfer learning or fine-tuning (Untrained layers are already provided, so just train them -> update training pipeline)
Model packaging (serialisation, containerisation) -> Docker
Choose deployment strategy (cloud or local) and interaction type (API, webapp, cli, embedded systems)
Train the model with Vertex AI (separate data and model -> don't use the container) and upload trained weights

(Optional):

Build + deploy custom webapp
Support for multiple datasets
Add output text size control feature
Add context area for user defined personalisations
Add support for pdf (and other) files (multimodality)

Pipelines

Workflow (Files to update)

See architecture file for detailed breakdown of the project's architecture and what each file does.

logging.py
pyproject.toml
params.json
config.json
src/entity
src/config
src/components
src/pipeline
main.py
app.py
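
The order of the list above suggests that each new stage is defined from its settings outward: values in params.json and config.json, then an entity, a config reader, a component, and a pipeline that main.py and app.py can call. The sketch below shows one common way such layers fit together. Every class, function, and key name in it is hypothetical and chosen for illustration; the architecture file remains the reference for the actual layout.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SummarizerStageConfig:
    """src/entity: typed settings for one stage (hypothetical fields)."""
    root_dir: Path
    max_input_length: int


class ConfigurationManager:
    """src/config: turns raw config.json / params.json content into entities."""

    def __init__(self, config: dict, params: dict):
        self.config = config
        self.params = params

    def get_stage_config(self) -> SummarizerStageConfig:
        return SummarizerStageConfig(
            root_dir=Path(self.config["root_dir"]),
            max_input_length=int(self.params["max_input_length"]),
        )


class SummarizerStage:
    """src/components: the stage logic, driven only by its config entity."""

    def __init__(self, config: SummarizerStageConfig):
        self.config = config

    def run(self, text: str) -> str:
        # Placeholder logic: truncate the input instead of calling a real model.
        return text[: self.config.max_input_length]


def run_stage_pipeline(config: dict, params: dict, text: str) -> str:
    """src/pipeline: wires the config manager and the component together."""
    stage_config = ConfigurationManager(config, params).get_stage_config()
    return SummarizerStage(stage_config).run(text)
```

In this arrangement the component never reads JSON files itself, so a change to a parameter touches only the settings file and the entity.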
