From afbc2c5218db8b02d87942f24c398897bb95db7e Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 09:54:16 +0800 Subject: [PATCH 01/29] Update index.html --- index.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.html b/index.html index 373119fe3..085dbb60d 100644 --- a/index.html +++ b/index.html @@ -3,7 +3,7 @@ + content="ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling."> Nerfies: Deformable Neural Radiance Fields From 503ee0706ea5f6012968a2e0be80b691f0eeaf23 Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 09:56:53 +0800 Subject: [PATCH 02/29] Create static.yml --- .github/workflows/static.yml | 43 ++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100644 .github/workflows/static.yml diff --git a/.github/workflows/static.yml b/.github/workflows/static.yml new file mode 100644 index 000000000..f2c9e97c9 --- /dev/null +++ b/.github/workflows/static.yml @@ -0,0 +1,43 @@ +# Simple workflow for deploying static content to GitHub Pages +name: Deploy static content to Pages + +on: + # Runs on pushes targeting the default branch + push: + branches: ["main"] + + # Allows you to run this workflow manually from the Actions tab + workflow_dispatch: + +# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages +permissions: + contents: read + pages: write + id-token: write + +# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued. +# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete. 
+concurrency: + group: "pages" + cancel-in-progress: false + +jobs: + # Single deploy job since we're just deploying + deploy: + environment: + name: github-pages + url: ${{ steps.deployment.outputs.page_url }} + runs-on: ubuntu-latest + steps: + - name: Checkout + uses: actions/checkout@v4 + - name: Setup Pages + uses: actions/configure-pages@v5 + - name: Upload artifact + uses: actions/upload-pages-artifact@v3 + with: + # Upload entire repository + path: '.' + - name: Deploy to GitHub Pages + id: deployment + uses: actions/deploy-pages@v4 From 333f50011cd6fefb2aa10fdf2fd09222dd9eeed6 Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:04:20 +0800 Subject: [PATCH 03/29] Update index.html --- index.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.html b/index.html index 085dbb60d..176205421 100644 --- a/index.html +++ b/index.html @@ -6,7 +6,7 @@ content="ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling."> - Nerfies: Deformable Neural Radiance Fields + ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling From 5fa9a8fbd1bef26dc6631c2f0895efcd03465319 Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:06:56 +0800 Subject: [PATCH 04/29] Update index.html --- index.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.html b/index.html index 176205421..4b4b07163 100644 --- a/index.html +++ b/index.html @@ -88,7 +88,7 @@
-

Nerfies: Deformable Neural Radiance Fields

+

ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling

Keunhong Park1, From 8ee902332fc70336011c23e8e83d12f1a93731f7 Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:10:13 +0800 Subject: [PATCH 05/29] Update index.html --- index.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.html b/index.html index 4b4b07163..7437b2f64 100644 --- a/index.html +++ b/index.html @@ -91,7 +91,7 @@

ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling

- Keunhong Park1, + Quanwei Yang1, Utkarsh Sinha2, From ecbd7e4b1ed61266e8340b37c81ce03f0735a216 Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:26:45 +0800 Subject: [PATCH 06/29] Update index.html --- index.html | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/index.html b/index.html index 7437b2f64..256e3e75d 100644 --- a/index.html +++ b/index.html @@ -93,15 +93,15 @@

ShowMaker: Creating High-Fidelity 2D Hu Quanwei Yang1, - Utkarsh Sinha2, + Jianzhi Guan2, - Jonathan T. Barron2, + Kaisiyuan Wang2, - Sofien Bouaziz2, + Lingyun Yu2, - Dan B Goldman2, + Wenqing Chu2, Steven M. Seitz1,2, From da9a4b267cada3dcb7b275f240fcf2eced6cc1ea Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:31:14 +0800 Subject: [PATCH 07/29] Update index.html MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Update authors --- index.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.html b/index.html index 256e3e75d..e2da921a7 100644 --- a/index.html +++ b/index.html @@ -104,7 +104,7 @@

ShowMaker: Creating High-Fidelity 2D Hu Wenqing Chu2, - Steven M. Seitz1,2, + Zhiqiang Feng1,2, Ricardo Martin-Brualla2 From c32373bd8dede857c0f632e5f02fa32269efde7f Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:37:29 +0800 Subject: [PATCH 08/29] Update index.html --- index.html | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/index.html b/index.html index e2da921a7..d7d6a4810 100644 --- a/index.html +++ b/index.html @@ -95,20 +95,30 @@

ShowMaker: Creating High-Fidelity 2D Hu Jianzhi Guan2, - Kaisiyuan Wang2, + Kaisiyuan Wang3*, - Lingyun Yu2, + Lingyun Yu1, - Wenqing Chu2, + Wenqing Chu3, - Zhiqiang Feng1,2, + Zhiqiang Feng3, - Ricardo Martin-Brualla2 + Haocheng Feng3 + + Errui Ding3 + + + Jingdong Wang3 + + + Hongtao Xie1* + +

From db7a995b7bc0e6e3a990a010f54745583bca5d9e Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:40:26 +0800 Subject: [PATCH 09/29] Update index.html --- index.html | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/index.html b/index.html index d7d6a4810..7ed6cb101 100644 --- a/index.html +++ b/index.html @@ -122,8 +122,10 @@

ShowMaker: Creating High-Fidelity 2D Hu

- 1University of Washington, - 2Google Research + 1University of Science and Technology of China, + 2Tsinghua University, + 3Department of Computer Vision Technology (VIS), Baidu Inc. +
From 3ad32d4bb945024fae7be9b8d7fe38577a40647b Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:44:38 +0800 Subject: [PATCH 10/29] Update index.html --- index.html | 39 ++++++++++++++++++--------------------- 1 file changed, 18 insertions(+), 21 deletions(-) diff --git a/index.html b/index.html index 7ed6cb101..5db61930c 100644 --- a/index.html +++ b/index.html @@ -270,31 +270,28 @@

Abstract

- We present the first method capable of photorealistically reconstructing a non-rigidly - deforming scene using photos/videos captured casually from mobile phones. + Although significant progress has been made in human video generation, most + previous studies focus on either human facial animation or full-body animation, + which cannot be directly applied to produce realistic conversational human videos + with frequent hand gestures and various facial movements simultaneously.

- Our approach augments neural radiance fields - (NeRF) by optimizing an - additional continuous volumetric deformation field that warps each observed point into a - canonical 5D NeRF. - We observe that these NeRF-like deformation fields are prone to local minima, and - propose a coarse-to-fine optimization method for coordinate-based models that allows for - more robust optimization. - By adapting principles from geometry processing and physical simulation to NeRF-like - models, we propose an elastic regularization of the deformation field that further - improves robustness. + To address these limitations, we propose a 2D human video generation framework, + named ShowMaker, capable of generating high-fidelity half-body conversational + videos based on 2D key points via fine-grained diffusion modeling. We leverage + dual-stream diffusion models as the backbone of our framework and carefully + design two novel components for crucial local regions (i.e., hands and face) that can + be easily integrated into our backbone. Specifically, to handle the challenging hand + generation caused by sparse motion guidance, we propose a novel Key Point-based + Fine-grained Hand Modeling module by amplifying positional information from + raw hand key points and constructing a corresponding key point-based codebook. + Moreover, to restore richer facial details in generated results, we introduce a Face + Recapture module, which extracts facial texture features and global identity features + from the aligned human face and integrates them into the diffusion process for face enhancement.

- We show that Nerfies can turn casually captured selfie - photos/videos into deformable NeRF - models that allow for photorealistic renderings of the subject from arbitrary - viewpoints, which we dub "nerfies". We evaluate our method by collecting data - using a - rig with two mobile phones that take time-synchronized photos, yielding train/validation - images of the same pose at different viewpoints. We show that our method faithfully - reconstructs non-rigidly deforming scenes and reproduces unseen views with high - fidelity. + Extensive quantitative and qualitative experiments demonstrate the + superior visual quality and temporal consistency of our method

From 33810c94e18c13702ed56e7dc093d994c93687e8 Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:51:15 +0800 Subject: [PATCH 11/29] Update index.html --- index.html | 3 +++ 1 file changed, 3 insertions(+) diff --git a/index.html b/index.html index 5db61930c..748c16de3 100644 --- a/index.html +++ b/index.html @@ -103,6 +103,9 @@

ShowMaker: Creating High-Fidelity 2D Hu Wenqing Chu3, + + Hang Zhou3, + Zhiqiang Feng3, From 1ad1c3f7f96e4ce1f0e1043a5156606ed37b9849 Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:52:38 +0800 Subject: [PATCH 12/29] Update index.html --- index.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.html b/index.html index 748c16de3..0879de3bf 100644 --- a/index.html +++ b/index.html @@ -93,7 +93,7 @@

ShowMaker: Creating High-Fidelity 2D Hu Quanwei Yang1, - Jianzhi Guan2, + Jiazhi Guan2, Kaisiyuan Wang3*, From db2bae9691f42c2e65b8a9fdeed47cc292dd450a Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 10:59:44 +0800 Subject: [PATCH 13/29] Update index.html --- index.html | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/index.html b/index.html index 0879de3bf..98394ddb8 100644 --- a/index.html +++ b/index.html @@ -457,14 +457,16 @@

Related Links

BibTeX

@article{park2021nerfies,
-  author    = {Park, Keunhong and Sinha, Utkarsh and Barron, Jonathan T. and Bouaziz, Sofien and Goldman, Dan B and Seitz, Steven M. and Martin-Brualla, Ricardo},
-  title     = {Nerfies: Deformable Neural Radiance Fields},
-  journal   = {ICCV},
-  year      = {2021},
+  author    = {Quanwei Yang and Jiazhi Guan and Kaisiyuan Wang and Lingyun Yu and Wenqing Chu and Hang Zhou and Zhiqiang Feng and Haocheng Feng and Errui Ding and Jingdong Wang and Hongtao Xie},
+  title     = {ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling},
+  journal   = {NeurIPS},
+  year      = {2024},
 }
+ ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling. +
From 65c186ef0683984bc29a768f13fb49b3b6a8fc62 Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 11:10:05 +0800 Subject: [PATCH 14/29] Update index.html --- index.html | 25 +++++++++---------------- 1 file changed, 9 insertions(+), 16 deletions(-) diff --git a/index.html b/index.html index 98394ddb8..a7dd231a7 100644 --- a/index.html +++ b/index.html @@ -273,28 +273,21 @@

Abstract

- Although significant progress has been made in human video generation, most - previous studies focus on either human facial animation or full-body animation, - which cannot be directly applied to produce realistic conversational human videos - with frequent hand gestures and various facial movements simultaneously. + Although significant progress has been made in human video generation, most previous studies focus on either human facial animation or full-body animation, + which cannot be directly applied to produce realistic conversational human videos with frequent hand gestures and various facial movements simultaneously.

- To address these limitations, we propose a 2D human video generation framework, - named ShowMaker, capable of generating high-fidelity half-body conversational - videos based on 2D key points via fine-grained diffusion modeling. We leverage - dual-stream diffusion models as the backbone of our framework and carefully - design two novel components for crucial local regions (i.e., hands and face) that can - be easily integrated into our backbone. Specifically, to handle the challenging hand - generation caused by sparse motion guidance, we propose a novel Key Point-based - Fine-grained Hand Modeling module by amplifying positional information from + To address these limitations, we propose a 2D human video generation framework, named ShowMaker, capable of generating high-fidelity half-body conversational + videos based on 2D key points via fine-grained diffusion modeling. + We leverage dual-stream diffusion models as the backbone of our framework and carefully design two novel components for crucial local regions (i.e., hands and face) that can + be easily integrated into our backbone. + Specifically, to handle the challenging hand generation caused by sparse motion guidance, we propose a novel Key Point-based Fine-grained Hand Modeling module by amplifying positional information from raw hand key points and constructing a corresponding key point-based codebook. - Moreover, to restore richer facial details in generated results, we introduce a Face - Recapture module, which extracts facial texture features and global identity features + Moreover, to restore richer facial details in generated results, we introduce a Face Recapture module, which extracts facial texture features and global identity features from the aligned human face and integrates them into the diffusion process for face enhancement.

- Extensive quantitative and qualitative experiments demonstrate the - superior visual quality and temporal consistency of our method + Extensive quantitative and qualitative experiments demonstrate the superior visual quality and temporal consistency of our method

From ead4c75a7033e6e82bcebc8f3f6af0834152dc50 Mon Sep 17 00:00:00 2001 From: Mumuwei <44329935+Mumuwei@users.noreply.github.com> Date: Thu, 31 Oct 2024 11:13:56 +0800 Subject: [PATCH 15/29] Update index.html --- index.html | 2 -- 1 file changed, 2 deletions(-) diff --git a/index.html b/index.html index a7dd231a7..c62bfcf9b 100644 --- a/index.html +++ b/index.html @@ -458,8 +458,6 @@

BibTeX

- ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling. -