Merge remote-tracking branch 'origin/paper-submission'
hatellezp committed Jul 24, 2024
2 parents 1b7e61a + 49810ee commit f6ecce7
Showing 25 changed files with 652 additions and 472 deletions.
150 changes: 73 additions & 77 deletions README.md
@@ -6,131 +6,92 @@ Models can be trained using a mix of real and generated data. They can also be l

<img src="docs/images/general_pipeline.png" />

## Installation

We recommend using a virtual environment. Install the requirements:

```
pip install -r requirements.txt
```

## Datasets

We experimented with the PEOPLE category from the following datasets:

- [COCO](https://cocodataset.org/#home) PEOPLE dataset:

```
./run.sh coco
```

- [Flickr30K Entities](https://bryanplummer.com/Flickr30kEntities/) PEOPLE dataset:

```
./run.sh flickr30k
```


Data will be downloaded and placed in the respective directories for images, labels, and captions.
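For reference, here is a sketch of the layout implied by the data paths in `conf/config.yaml` (the exact sub-folder roles are assumptions, not guaranteed by the scripts):

```
data/
├── real/
│   ├── images/     # downloaded images
│   ├── labels/     # object labels
│   └── captions/   # optional captions
├── generated/      # synthetic images, one sub-folder per ControlNet
└── datasets/       # dataset split files (train.txt, val.txt, test.txt)
```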

## Generate images

To generate some images, you can use

```bash
./run.sh gen
```

See the `conf/config.yaml` file for all details and configuration options.

You can also override the configuration directly on the command line:

```bash
./run.sh gen model.cn_use=openpose prompt.base="Trump" prompt.modifier="dancing" data_path.generated=mysupertest
./run.sh gen model.cn_use=openpose prompt.base="Arnold" prompt.modifier="dancing"
```

If you use the `controlnet_segmentation` ControlNet, you will find your images in `data/generated/controlnet_segmentation`, along with the base image and the extracted feature.

<div><img width="350" src="docs/images/1_1.png"/></div>
<div><img width="350" src="docs/images/2_1.png"/></div>
The configuration options work for all scripts available in the framework. For example, you can set different initial dataset sizes by controlling the number of samples:

<div><img width="350" src="docs/images/3_1.png"/></div>
<div><img width="350" src="docs/images/4_1.png"/></div>


```bash
./run.sh coco ml.train_nb=500
```

## Multi run

You can also launch multiple runs. Here's an example of a multi-run with 3 different generators:

```
./run.sh gen model.cn_use=frankjoshua_openpose,fusing_openpose,lllyasviel_openpose
```

The list of available models can be found in `conf/config.yaml`. We have 4 available extractors at the moment (Segmentation, OpenPose, Canny, MediaPipeFace). If you add another ControlNet model, make sure its name contains one of the following strings to set the extractor to use (see the sketch after the list):

- openpose
- canny
- segmentation
- mediapipe_face
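
For example, a hypothetical new entry in `conf/config.yaml` could look like the sketch below; the `_canny` suffix in the key is what selects the Canny extractor (the key name `my_custom_canny` is a placeholder):

```yaml
model:
  cn_use: my_custom_canny # select the new model for generation
  cn:
    # the "_canny" suffix tells the framework to use the Canny extractor
    - my_custom_canny: lllyasviel/sd-controlnet-canny
```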


## Test the quality of images with IQA measures

One way of testing the quality of the generated images is to use computational and statistical methods. One good library for this is [IQA-PyTorch](https://github.com/chaofengc/IQA-PyTorch); you can go read its [paper](https://arxiv.org/pdf/2208.14818.pdf).

There are two approaches to measure image quality:
- full reference: compare against a real pristine image
- no reference: compute metrics following a learned opinion

You can use these measures in the same way the generation is done:
```bash
# For paper
./run.sh src/iqa_paper.py
# Framework general script
./run.sh iqa
```

It follows the same configuration as the generation part, using the same `conf/config.yaml` file.
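
For example, a sketch (assuming the same hydra override syntax used for generation) that restricts the metrics and selects which generated dataset to score:

```bash
# hypothetical overrides; iqa.device, iqa.metrics and model.cn_use are keys from conf/config.yaml
./run.sh iqa iqa.device=cuda 'iqa.metrics=[brisque,dbcnn]' model.cn_use=lllyasviel_canny
```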

<img src="docs/images/iqa_measure.png" />


### IQA paper

In this script the approach to measure quality uses the extensive library
IQA-PyTorch: https://github.com/chaofengc/IQA-PyTorch
Read also the paper: https://arxiv.org/pdf/2208.14818.pdf

There are basically two approaches to measure image quality:
- full reference: compare against a real pristine image
- no reference: compute metrics following a learned opinion

Because the images are generated, there is no reference image to compare against, so we use the no-reference metrics.

Note that the methods used here are agnostic to the content of the image: no subjective or conceptual score is given. The measures only give an idea of how 'good looking' the images are.

Methods used:
- brisque: https://www.sciencedirect.com/science/article/abs/pii/S0730725X17301340
- dbcnn: https://arxiv.org/pdf/1907.02665v1.pdf
- niqe: https://live.ece.utexas.edu/research/quality/nrqa.html



- dbcnn is good for: blur, contrast distortion, white and pink noise, dithering, over- and under-exposure
- brisque is good for: distortion, luminance and blur
- ilniqe is good for: distortion, blur and compression distortion

Note that the metrics each have different ranges ([0, 1], [0, +inf]) and sometimes less is better while sometimes more is better; it would be a mistake to try to rescale or invert them. It is better to treat each one separately.
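
For illustration, here is a minimal standalone sketch using pyiqa directly (this is not the project's `iqa` script; the random tensor is a placeholder for real generated images):

```python
import torch
import pyiqa

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# no-reference metrics used in this project
brisque = pyiqa.create_metric("brisque", device=device)  # lower is better
dbcnn = pyiqa.create_metric("dbcnn", device=device)      # higher is better

# placeholder batch; in practice pass generated images as (N, 3, H, W) tensors in [0, 1]
img = torch.rand(1, 3, 512, 512, device=device)

print("brisque:", brisque(img).item(), "lower_better =", brisque.lower_better)
print("dbcnn:  ", dbcnn(img).item(), "lower_better =", dbcnn.lower_better)
```

The `lower_better` flag is what tells you how to rank images when sampling on IQA scores.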

A file is created at `data/iqa/<cn_use>_iqa.json` with the following structure:

@@ -179,19 +140,54 @@ wandb login
Create `train.txt`, `val.txt`, and `test.txt`:

```
./run.sh create_dataset
```

Launch the training!

```
./run.sh train
```

You can also create the dataset and launch training in one step, which lets you execute multiple trainings with multiple augmentation percentages on your server using hydra:

```
./run.sh create_n_train.py -m ml.augmentation_percent=0.1 ml.sampling.enable=True ml.sampling.metric=dbcnn,brisque ml.sampling.sample=best ml.epochs=15
```

## Download and test models

The download folder can be set in the config file. It will contain one folder per wandb project, and each project folder contains (see the sketch after the commands below):

- all models for the project
- a summary file with parameters and results (mAP, precision, etc.)
- when running `test.py`: a results.csv file containing the test results with mAP and other info

```
python src/download.py ml.wandb.project=your-project ml.wandb.download.download=true ml.wandb.download.list_all=true

python src/test.py
```
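
The resulting layout looks roughly like this (file and folder names other than `results.csv` are hypothetical and depend on your wandb runs):

```
<download_folder>/        # set in conf/config.yaml
└── your-project/         # one folder per wandb project
    ├── <run_name>.pt     # downloaded model weights
    ├── summary file      # parameters and results (mAP, precision, ...)
    └── results.csv       # written when running test.py
```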
**Note**: Other scripts exist to execute different studies, such as Active Learning, which is still experimental. You can check the `src` folder for those scripts (this code is not yet fully integrated into the framework, so some path or configuration modifications might be necessary for correct execution).

## Runs Results Plots

Here are plots from some of the many runs and studies that we performed:

### Coco Sampling

<img src="docs/images/COCO_all_samplings_2.png" />

### Flickr Sampling

<img src="docs/images/flickr_all_samplings.png" />

### Loss values for COCO

<img src="docs/images/COCO_loss.png" />

### Random Sampling - Regular Runs

<img src="docs/images/random_sampling.png" />

82 changes: 50 additions & 32 deletions conf/config.yaml
@@ -12,30 +12,34 @@
# If not, see <http://www.gnu.org/licenses/>.

data:
# data paths and formats configuration :
# create data folder, and all other folders will be created automatically
base: "data"
real: "real" # should contains images/, labels/, and captions/ (captions are optional)
real: "real" # Will contain images/, labels/, and captions/ (captions are optional)
generated: "generated"
datasets: "datasets"
image_formats: ["jpeg", "jpg"]

# For the future: put every parameter related to machine learning here: dataset size,
# ratio between train and test, learning rate ...
ml:
# number of samples for training, validation, and test
val_nb: 300
test_nb: 300
train_nb: 250

augmentation_percent: 0.1 # controls all augmentation percent parameters everywhere
augmentation_percent_baseline: 0 # Ablation study augmentation => For paper only
epochs: 300
sampling: # IQA Sampling, check end of this yaml file for list of available metrics
# Telling the trainer to use IQA metrics already calculated to sample on best images for training
enable: false
metric: brisque # brisque (smaller is better), dbcnn (bigger is better), ilniqe (smaller is better)
sample: best # whether to take smaller or bigger values is decided depending on the metric

wandb: # set wandb parameters for run tracking and model logging
entity: your-wandb-username
project: sdcn-project
download:
list_all: false
list_finished: true
@@ -46,12 +50,20 @@ ml:
query_filter: false

prompt:
# if use_captions is set to 1 in "model", a vocabulary will be used
# to modify the original captions and generate newer captions
# to create multiple diverse synthetic images from the same original image
template: vocabulary
modify_captions: 1
generation_size: 10

# POSITIVE PROMPTS: THIS section is used if use_captions is set to 0 in "model"
# This section can be used if your dataset doesn't have captions already included
base: ["Sandra Oh", "Kim Kardashian", "rihanna ", "taylor swift"]
quality: "showing emotion, great realistic face, best quality, extremely detailed,"
modifier: "Happy man smiling"

# Best negative prompts were chosen for better image quality generation
negative:
[
"monochrome, lowres, bad anatomy, worst quality, low quality, cartoon, unrealistic, bad proportion,",
Expand All @@ -62,18 +74,29 @@ prompt:
"extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs,",
"extra arms, extra legs, fused fingers, too many fingers, long neck, no cars, no people, illustration, painting,",
"drawing, art, sketch, anime, deformation, distorsion",
]
negative_simple: "monochrome, lowres, bad anatomy, worst quality, low quality, cartoon, unrealistic, bad proportion, disfigured, mutation, deformed mouth, deformed eyes, unnatural teeth, unnatural eyes, unnatural mouth, unnatural face, unnatural face expression, not human"

model:
use_captions: 1 # use captions that come with the dataset
# use_labels: 0 # not used for now, will be used in future features
sd: runwayml/stable-diffusion-v1-5 # stable diffusion version
cn_use: controlnet_segmentation # control net to use from the list
cn:
# You should list all the ControlNets that you would like to use here
# add an extractor name to the end in order to define which extractor to use :
# - _segmentation
# - _canny
# - _openpose
# - _mediapipe_face
#
# All models listed below can be found on hugging face

- controlled_false_segmentation: lllyasviel/sd-controlnet-seg # bad extractor, just for paper tests

# Segmentation
- controlnet_segmentation: lllyasviel/sd-controlnet-seg

# Canny
- lllyasviel_canny: lllyasviel/sd-controlnet-canny
- lllyasviel_scribble_canny: lllyasviel/sd-controlnet-scribble
@@ -86,28 +109,23 @@ model:
# MediaPipeFace
- crucible_mediapipe_face: CrucibleAI/ControlNetMediaPipeFace

# Coming Soon !
# - depth: lllyasviel/sd-controlnet-depth
# - hed: lllyasviel/sd-controlnet-hed
# - normal: lllyasviel/sd-controlnet-normal
# - scribble: lllyasviel/sd-controlnet-scribble
# - segmentation: lllyasviel/sd-controlnet-seg
# - mlsd: lllyasviel/sd-controlnet-mlsd

cn_extra_settings:
# In case your ControlNet class takes other parameters, use this section to define them
crucible_mediapipe_face:
subfolder: diffusion_sd15
seed: 34567 # random seed for SDCN generation
device: cuda # cpu or cuda ?

# IQA METRICS SAMPLING: calculate all IQA metrics listed for all generated datasets
iqa:
device: cuda # cpu or cuda ?
metrics: [brisque, dbcnn, nima, ilniqe] # metrics to calculate scores
# available metrics: brisque, clipiqa+, dbcnn, ilniqe, niqe, nima, cnniqa, nrqm, pi
# read more on : https://github.com/chaofengc/IQA-PyTorch/blob/main/docs/ModelCard.md
# ACTIVE LEARNING SAMPLING
active:
enable: False
rounds: 5 # rounds of AL
sel: 125 # number of samples added each round
sampling: confidence # confidence, coreset, baseline (for ablation study)
Binary file removed docs/images/1_1.png
Binary file not shown.
Binary file removed docs/images/2_1.png
Binary file not shown.
Binary file removed docs/images/3_1.png
Binary file not shown.
Binary file removed docs/images/4_1.png
Binary file not shown.
Binary file removed docs/images/5_1.png
Binary file not shown.
Binary file added docs/images/COCO_all_samplings_2.png
Binary file added docs/images/COCO_loss.png
Binary file removed docs/images/b_1.png
Binary file not shown.
Binary file removed docs/images/f_1.png
Binary file not shown.
Binary file added docs/images/flickr_all_samplings.png
Binary file modified docs/images/iqa_measure.png
Binary file added docs/images/random_sampling.png
