diff --git a/.github/workflows/static.yml b/.github/workflows/static.yml
new file mode 100644
index 000000000..f2c9e97c9
--- /dev/null
+++ b/.github/workflows/static.yml
@@ -0,0 +1,43 @@
+# Simple workflow for deploying static content to GitHub Pages
+name: Deploy static content to Pages
+
+on:
+  # Runs on pushes targeting the default branch
+  push:
+    branches: ["main"]
+
+  # Allows you to run this workflow manually from the Actions tab
+  workflow_dispatch:
+
+# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
+# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
+concurrency:
+  group: "pages"
+  cancel-in-progress: false
+
+jobs:
+  # Single deploy job since we're just deploying
+  deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+      - name: Setup Pages
+        uses: actions/configure-pages@v5
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          # Upload entire repository
+          path: '.'
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
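This workflow publishes the repository root on every push to main: path: '.' uploads the whole checkout as the Pages artifact. To sanity-check the page before pushing, one option is to serve the repository locally with Python's standard-library HTTP server. This is a convenience sketch, not part of the workflow, and the port number is arbitrary:

# preview.py - serve the repository root at http://localhost:8000,
# roughly mirroring what actions/upload-pages-artifact packages (path: '.')
from http.server import HTTPServer, SimpleHTTPRequestHandler

HTTPServer(("", 8000), SimpleHTTPRequestHandler).serve_forever()

Run it from the repository root and open http://localhost:8000/index.html. The index.html changes then swap the Nerfies template copy for the ShowMaker abstract: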
+ content="ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling."> -- We present the first method capable of photorealistically reconstructing a non-rigidly - deforming scene using photos/videos captured casually from mobile phones. + Although significant progress has been made in human video generation, most previous studies focus on either human facial animation or full-body animation, + which cannot be directly applied to produce realistic conversational human videos with frequent hand gestures and various facial movements simultaneously.
- Our approach augments neural radiance fields - (NeRF) by optimizing an - additional continuous volumetric deformation field that warps each observed point into a - canonical 5D NeRF. - We observe that these NeRF-like deformation fields are prone to local minima, and - propose a coarse-to-fine optimization method for coordinate-based models that allows for - more robust optimization. - By adapting principles from geometry processing and physical simulation to NeRF-like - models, we propose an elastic regularization of the deformation field that further - improves robustness. + To address these limitations, we propose a 2D human video generation framework, named ShowMaker, capable of generating high-fidelity half-body conversational + videos based on 2D key points via fine-grained diffusion modeling. + We leverage dual-stream diffusion models as the backbone of our framework and carefully design two novel components for crucial local regions (i.e., hands and face) that can + be easily integrated into our backbone. + Specifically, to handle the challenging hand generation caused by sparse motion guidance, we propose a novel Key Point-based Fine-grained Hand Modeling module by amplifying positional information from + raw hand key points and constructing a corresponding key point-based codebook. + Moreover, to restore richer facial details in generated results, we introduce a Face Recapture module, which extracts facial texture features and global identity features + from the aligned human face and integrates them into the diffusion process for face enhancement.
- We show that Nerfies can turn casually captured selfie - photos/videos into deformable NeRF - models that allow for photorealistic renderings of the subject from arbitrary - viewpoints, which we dub "nerfies". We evaluate our method by collecting data - using a - rig with two mobile phones that take time-synchronized photos, yielding train/validation - images of the same pose at different viewpoints. We show that our method faithfully - reconstructs non-rigidly deforming scenes and reproduces unseen views with high - fidelity. + Extensive quantitative and qualitative experiments demonstrate the superior visual quality and temporal consistency of our method
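The Key Point-based Fine-grained Hand Modeling idea in the new abstract (amplify sparse hand key points, then look features up in a key point-based codebook) can be pictured with a short PyTorch sketch. Everything below is illustrative rather than the paper's actual implementation: the class name, the 21-key-point hand layout (the common OpenPose/MediaPipe convention), the feature width, and the straight-through quantization trick are all assumptions.

import torch
import torch.nn as nn

class KeyPointCodebook(nn.Module):
    """Illustrative sketch: embed sparse hand key points, then snap the
    embedding to its nearest learned codebook entry (VQ-style lookup)."""

    def __init__(self, num_entries=512, num_keypoints=21, dim=256):
        super().__init__()
        # "Amplify" the sparse positional signal: (x, y) per key point -> dim
        self.embed = nn.Sequential(
            nn.Linear(num_keypoints * 2, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )
        self.codebook = nn.Embedding(num_entries, dim)

    def forward(self, keypoints):
        # keypoints: (batch, num_keypoints, 2), normalized image coordinates
        z = self.embed(keypoints.flatten(1))          # (batch, dim)
        dists = torch.cdist(z, self.codebook.weight)  # (batch, num_entries)
        idx = dists.argmin(dim=1)                     # nearest entry per sample
        quantized = self.codebook(idx)                # (batch, dim)
        # Straight-through estimator: use the quantized features in the
        # forward pass while letting gradients flow back to the encoder
        return z + (quantized - z).detach(), idx

# Toy usage: quantize a batch of 4 hand poses into codebook features
feats, codes = KeyPointCodebook()(torch.rand(4, 21, 2))
print(feats.shape, codes.shape)  # torch.Size([4, 256]) torch.Size([4])

In a full system these quantized hand features, and analogously the texture and identity features from a Face Recapture-style module, would be injected as conditioning into the dual-stream diffusion backbone. The remainder of the diff strips the Nerfies template's demo sections and related-work text: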
-          Using nerfies you can create fun visual effects. This Dolly zoom effect
-          would be impossible without nerfies since it would require going through a wall.
-          As a byproduct of our method, we can also solve the matting problem by ignoring
-          samples that fall outside of a bounding box during rendering.
-          We can also animate the scene by interpolating the deformation latent codes of two input
-          frames. Use the slider here to linearly interpolate between the left frame and the right
-          frame.
-          Start Frame
-          End Frame
-          Using Nerfies, you can re-render a video from a novel
-          viewpoint such as a stabilized camera by playing back the training deformations.
-          There's a lot of excellent work that was introduced around the same time as ours.
-          Progressive Encoding for Neural Optimization introduces an idea similar to our windowed position encoding for coarse-to-fine optimization.
-          D-NeRF and NR-NeRF
-          both use deformation fields to model non-rigid scenes.
-          Some works model videos with a NeRF by directly modulating the density, such as Video-NeRF, NSFF, and DyNeRF.
-          There are probably many more by the time you are reading this. Check out Frank Dellaert's survey on recent NeRF papers, and Yen-Chen Lin's curated list of NeRF papers.
-@article{park2021nerfies,
-  author  = {Park, Keunhong and Sinha, Utkarsh and Barron, Jonathan T. and Bouaziz, Sofien and Goldman, Dan B and Seitz, Steven M. and Martin-Brualla, Ricardo},
-  title   = {Nerfies: Deformable Neural Radiance Fields},
-  journal = {ICCV},
-  year    = {2021},
+@article{yang2024showmaker,
+  author  = {Yang, Quanwei and Guan, Jiazhi and Wang, Kaisiyuan and Yu, Lingyun and Chu, Wenqing and Zhou, Hang and Feng, Zhiqiang and Feng, Haocheng and Ding, Errui and Wang, Jingdong and Xie, Hongtao},
+  title   = {ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling},
+  journal = {NeurIPS},
+  year    = {2024},
 }
-          This means you are free to borrow the source code of this website, we just ask that you link back to this page in the footer. Please remember to remove the analytics code included in the header of the website which
-          you do not want on your website.
+          you do not want on your website. -->