- University of Washington
- Seattle, WA
- zhang-eh.github.io
- @enhao_zh
Stars
VOCAL-UDF: Self-Enhancing Video Data Management System for Compositional Events with Large Language Models
m&ms: A Benchmark to Evaluate Tool-Use for multi-step multi-modal tasks
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
A guidance language for controlling large language models.
This is a PyTorch implementation of Mask R-CNN on the CLEVR dataset.
A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
Official code for VisProg (CVPR 2023 Best Paper!)
Building blocks for foundation models.
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
VideoX: a collection of video cross-modal models
LAVIS - A One-stop Library for Language-Vision Intelligence
EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions
✨✨ Latest Advances on Multimodal Large Language Models
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
A SQL and R Synthesizer Using Query Reverse Engineering
Synthesizing SQL queries from input / output examples
MeshInsight: Dissecting Overheads of Service Mesh Sidecars
A modular active learning framework for Python
Repository for "Online Active Model Selection for Pre-trained ML Classifiers"
PyTorch Library for Active Learning to accompany Human-in-the-Loop Machine Learning book
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
(2021) QueryNet: Querying Neural Networks for Lightweight Specialized Models
Video and Image Analytics for Multiple Environments
Nemo debugs Distributed Systems by analyzing provenance graphs obtained during fault injection.
Scene Graph Prediction with Limited Labels