I would like to learn how embedding collection fuses lookups and poolings from different groups into a single kernel while supporting different vector sizes.
Jiaao-Bai changed the title to "[Question] Is there any related architecture design or documentation for embedding collection" on Mar 7, 2024.
Regarding your question, there is currently no public document describing our design. The embedding collection is built from several components, and I would like to provide some guidance to help you understand it:
1. Embedding: this is where we perform the forward and backward passes for the embedding. Different shardings of the embedding (e.g. data parallel, model parallel) have different implementations. The source code is under embedding.
2. Embedding Storage: this is how we store the embedding vectors. We provide static storage, in which the number of embedding vectors in a table cannot be changed after initialization, and dynamic embedding storage (see the storage sketch after this list). You can find their implementations here.
3. Data Distributor: this converts the data-parallel embedding input (the keys) into a model-parallel format that can be consumed by the Embedding. The code is here.
4. Embedding Operator: the basic operators we use in the Embedding. The forward pass is split into the stages model_forward, all2all, and network_forward, with a separate operator for each (see the all2all sketch after this list). The same mechanism applies to the backward pass, with network_backward, all2all, and local_reduce. You can find the code here.
5. Generic Lookup: this is a template kernel we use to generate kernels for the different settings used in embedding (see the kernel sketch after this list). The code is here.
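To make point 2 concrete, here is a minimal sketch of how static and dynamic storage can sit behind one lookup/update interface. This is not the real HugeCTR API; the class and method names (EmbeddingStorage, StaticStorage, lookup, update) are hypothetical and the method bodies are left as placeholders.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical storage interface: the embedding layer only needs to gather
// vectors for a batch of keys and scatter gradients back for the same keys.
class EmbeddingStorage {
 public:
  virtual ~EmbeddingStorage() = default;
  // Gather one embedding vector per key into d_out (device memory).
  virtual void lookup(const int* d_keys, int num_keys, float* d_out,
                      cudaStream_t stream) = 0;
  // Apply per-key gradients (e.g. an SGD update).
  virtual void update(const int* d_keys, int num_keys, const float* d_grads,
                      float lr, cudaStream_t stream) = 0;
};

// Static storage: a fixed-capacity dense table allocated up front; a key is a
// direct row index, so the number of vectors cannot grow after initialization.
class StaticStorage : public EmbeddingStorage {
  float* d_table_ = nullptr;  // [capacity, ev_size]
  size_t capacity_, ev_size_;
 public:
  StaticStorage(size_t capacity, size_t ev_size)
      : capacity_(capacity), ev_size_(ev_size) {
    cudaMalloc(reinterpret_cast<void**>(&d_table_),
               capacity_ * ev_size_ * sizeof(float));
  }
  ~StaticStorage() override { cudaFree(d_table_); }
  void lookup(const int*, int, float*, cudaStream_t) override { /* gather rows */ }
  void update(const int*, int, const float*, float, cudaStream_t) override { /* scatter-add */ }
};

// A dynamic storage would back the same interface with a GPU hash table so
// that new keys can be inserted (and stale ones evicted) during training.
```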
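For points 3 and 4, the model-parallel forward essentially follows "local lookup (model_forward) → all2all exchange → network_forward combine". The sketch below shows only the all2all step, using NCCL point-to-point calls; the fixed per-peer count and buffer layout are simplifying assumptions, while the real operator deals with variable-length buckets and multiple lookups.

```cuda
#include <nccl.h>
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical all2all: each GPU sends `count` floats to every peer and
// receives `count` floats from every peer on the given stream. This is the
// exchange that sits between model_forward and network_forward.
void all2all_forward(const float* d_send, float* d_recv, size_t count,
                     int num_gpus, ncclComm_t comm, cudaStream_t stream) {
  ncclGroupStart();
  for (int peer = 0; peer < num_gpus; ++peer) {
    ncclSend(d_send + peer * count, count, ncclFloat, peer, comm, stream);
    ncclRecv(d_recv + peer * count, count, ncclFloat, peer, comm, stream);
  }
  ncclGroupEnd();
}
```

The backward pass mirrors this: the network_backward gradients go through the same kind of all2all before local_reduce applies them to the local shards.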
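Finally, for your fusion question (points 1 and 5), the key idea is that a single templated kernel can serve many lookups at once when each lookup carries its own table pointer, offsets, and vector size, and the grid indexes (lookup, sample) pairs. The following is a minimal sketch under those assumptions, not the actual HugeCTR generic lookup; LookupDesc, fused_lookup_sum_pool, and the sum-pooling combiner are all illustrative choices.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical per-lookup descriptor: one entry per (table, feature group).
struct LookupDesc {
  const float* table;    // [num_rows, ev_size] embedding table for this lookup
  const int*   keys;     // keys of this lookup, flattened over the batch
  const int*   offsets;  // [batch_size + 1] CSR-style offsets into keys
  float*       out;      // [batch_size, ev_size] pooled output
  int          ev_size;  // embedding vector size, may differ per lookup
};

// One launch covers every lookup: blockIdx.y picks the lookup, blockIdx.x
// picks the sample, and threads stride over that lookup's ev_size, so lookups
// with different vector sizes and poolings share a single fused kernel.
__global__ void fused_lookup_sum_pool(const LookupDesc* descs, int num_lookups,
                                      int batch_size) {
  int lookup_id = blockIdx.y;
  int sample_id = blockIdx.x;
  if (lookup_id >= num_lookups || sample_id >= batch_size) return;

  LookupDesc d = descs[lookup_id];
  int start = d.offsets[sample_id];
  int end   = d.offsets[sample_id + 1];

  // Each thread accumulates a strided slice of this lookup's vector.
  for (int e = threadIdx.x; e < d.ev_size; e += blockDim.x) {
    float acc = 0.f;
    for (int k = start; k < end; ++k) {
      acc += d.table[static_cast<size_t>(d.keys[k]) * d.ev_size + e];
    }
    d.out[static_cast<size_t>(sample_id) * d.ev_size + e] = acc;
  }
}

// Launch sketch: a 2D grid of (batch_size, num_lookups) blocks, with the
// descriptor array resident in device memory.
void launch_fused_lookup(const LookupDesc* d_descs, int num_lookups,
                         int batch_size, cudaStream_t stream) {
  dim3 grid(batch_size, num_lookups);
  fused_lookup_sum_pool<<<grid, 128, 0, stream>>>(d_descs, num_lookups,
                                                  batch_size);
}
```

A production generic lookup would also need to cover different combiners (sum, mean, concatenation) and key/index types, but the fused-launch pattern is the same idea: every group is described by data, so one kernel instance can serve different vector sizes in a single launch.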
For your question, I think you can refer to points 1, 2, 4, and 5. I would suggest starting from the data-parallel embedding since it is easier to understand. Thanks!