[Question] Is there any related architecture design or documentation for embedding collection #444

Closed
Jiaao-Bai opened this issue Mar 7, 2024 · 2 comments
Labels
question Further information is requested

Comments

Jiaao-Bai commented Mar 7, 2024

I would like to learn how the embedding collection fuses lookups and poolings from different groups into a single kernel while supporting different vector sizes.

@Jiaao-Bai added the question label Mar 7, 2024
@Jiaao-Bai changed the title from its Chinese original (same meaning) to [Question] Is there any related architecture design or documentation for embedding collection Mar 7, 2024
@shijieliu (Collaborator) commented

Hi @Jiaao-Bai, thanks for trying out HugeCTR.

Regarding your question: there is currently no public documentation of the design. The embedding collection is built from several components, and I'd like to give you some pointers to help you understand them.

  1. Embedding: Embedding is where the forward and backward passes for embedding run. Each sharding strategy (e.g. data parallel, model parallel) has its own implementation. The source code is under embedding.
  2. Embedding Storage: Embedding Storage is how we store the embedding vectors. We provide static storage, in which the number of embedding vectors in a table cannot be changed after initialization, and dynamic storage, which can grow on demand (the first sketch after this list contrasts the two). You can find their implementations here.
  3. Data Distributor: This is how we convert the data-parallel embedding input (keys) into the model-parallel format that the Embedding can consume; the second sketch after this list shows the idea. The code is here.
  4. Embedding Operator: Basic operators that we use in the Embedding. The forward pass is staged as model_forward, all2all, and network_forward, with a separate operator for each stage; the backward pass mirrors this as network_backward, all2all, and local_reduce (the third sketch below pictures the pipeline). You can find the code here.
  5. Generic Lookup: This is a template kernel we use to instantiate the different kernel configurations needed in embedding; the last sketch below shows how such a kernel can fuse lookups with different vector sizes. The code is here.
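
To make point 2 concrete, here is a toy CPU-side sketch of the two storage semantics. The class names and layouts are made up for illustration, not HugeCTR's actual types; the real storages live on the GPU:

```cuda
#include <cstddef>
#include <unordered_map>
#include <vector>

// Static storage: capacity is fixed at construction, so a key must be
// a valid row index; no rows can be added after initialization.
struct StaticStorage {
  std::vector<float> rows;  // num_rows * dim floats, row-major
  size_t dim;
  StaticStorage(size_t num_rows, size_t d) : rows(num_rows * d, 0.f), dim(d) {}
  float* lookup(size_t key) { return &rows[key * dim]; }  // requires key < num_rows
};

// Dynamic storage: backed by a hash map, so a new key allocates a new
// embedding row on first touch and the table can keep growing.
struct DynamicStorage {
  std::unordered_map<long long, std::vector<float>> rows;
  size_t dim;
  explicit DynamicStorage(size_t d) : dim(d) {}
  float* lookup(long long key) {
    auto& row = rows[key];                  // inserts an empty row if absent
    if (row.empty()) row.assign(dim, 0.f);  // lazily initialize the new row
    return row.data();
  }
};
```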
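
For point 3, the conversion is essentially a regrouping: each GPU starts with the keys of its local batch shard for all tables, and the data distributor buckets them by the GPU that owns the corresponding table shard so an all2all can exchange them. A toy sketch, where the key % num_gpus ownership rule is an assumption purely for illustration:

```cuda
#include <vector>

// Group this GPU's data-parallel keys into per-destination send buckets.
// send[g] ends up holding every local key whose embedding row lives on GPU g.
std::vector<std::vector<long long>> bucket_keys_by_owner(
    const std::vector<long long>& local_keys, int num_gpus) {
  std::vector<std::vector<long long>> send(num_gpus);
  for (long long key : local_keys) {
    send[static_cast<int>(key % num_gpus)].push_back(key);  // toy ownership rule
  }
  return send;  // an all2all would then exchange these buckets between GPUs
}
```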
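
For point 4, the stage split can be pictured as a host-side pipeline. The Context type and the no-op stage bodies below are stand-ins, not the real operators, which work on device buffers and communicator handles:

```cuda
#include <cstdio>

struct Context { /* device buffers, streams, communicator, ... */ };

// Stand-ins for the per-stage operators; comments describe their roles.
void model_forward(Context&)    { std::puts("lookup + pool on this GPU's table shard"); }
void all2all(Context&)          { std::puts("exchange pooled vectors between GPUs"); }
void network_forward(Context&)  { std::puts("reorder received buffers into the dense output"); }
void network_backward(Context&) { std::puts("scatter top gradients back to per-GPU buffers"); }
void local_reduce(Context&)     { std::puts("accumulate per-key gradients for the local shard"); }

// Forward and backward each decompose into the three stages named above.
void mp_forward(Context& ctx)  { model_forward(ctx); all2all(ctx); network_forward(ctx); }
void mp_backward(Context& ctx) { network_backward(ctx); all2all(ctx); local_reduce(ctx); }

int main() { Context ctx; mp_forward(ctx); mp_backward(ctx); }
```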
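
And for point 5, which is closest to your original question: one way a single launch can fuse lookups and pooling across lookups with different vector sizes is to index everything through per-lookup offset arrays. The sketch below is a simplification (sum pooling only, float tables, no templates or vectorized loads), not the actual generic lookup kernel:

```cuda
#include <cuda_runtime.h>

// One grid row (blockIdx.y) per lookup; each thread owns one
// (sample, element) pair of that lookup's pooled output, so lookups
// with different embedding dims run in the same kernel launch.
__global__ void fused_lookup_sum_pooling(
    const float* const* tables,  // tables[i]: row-major [num_rows_i, ev_size[i]]
    const long long* keys,       // all keys, flattened in bucket order
    const int* bucket_range,     // [num_lookups * batch_size + 1] prefix sum of keys per bucket
    const int* ev_size,          // [num_lookups] embedding dim per lookup
    const int* ev_offset,        // [num_lookups + 1] prefix sum of ev_size
    int batch_size,
    float* out) {                // [batch_size, ev_offset[num_lookups]], sample-major
  int i = blockIdx.y;                  // which lookup this block works on
  int dim = ev_size[i];
  int out_dim = ev_offset[gridDim.y];  // total concatenated width per sample
  for (int t = blockIdx.x * blockDim.x + threadIdx.x; t < batch_size * dim;
       t += gridDim.x * blockDim.x) {
    int b = t / dim, e = t % dim;      // sample and element within this lookup
    int bucket = i * batch_size + b;   // this (lookup, sample) key bucket
    float acc = 0.f;
    for (int k = bucket_range[bucket]; k < bucket_range[bucket + 1]; ++k)
      acc += tables[i][keys[k] * dim + e];  // gather one element and sum-pool
    out[b * out_dim + ev_offset[i] + e] = acc;
  }
}

// Launch with gridDim.y == num_lookups, e.g.:
//   dim3 grid((batch_size * max_ev_size + 255) / 256, num_lookups);
//   fused_lookup_sum_pooling<<<grid, 256>>>(...);
```

Note the uneven work across lookups in this naive version; templating the kernel over dim, key type, and combiner, as the generic lookup does, is what makes it efficient across settings.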

For your question, I think you can refer to 1, 2, 4, and 5. I would suggest starting from the data parallel embedding since it's easier to understand. Thanks!

@Jiaao-Bai (Author) commented

thanks
