Made framework changes to initialize specific cache block sizes for TRSM #570

Open
wants to merge 1 commit into base: master

Commits on Oct 28, 2021

  1. Made framework changes to initialize specific cache block sizes for TRSM.
    
    Details:
    - This commit addresses performance optimization (single-threaded and
      multi-threaded) of DTRSM on zen2.
    - The new optimization uses different MC, KC, and NC values for TRSM than
      those used by other level-3 routines such as DGEMM.
    - Changed the TRSM framework code to select these block sizes for TRSM
      on zen-family configurations.
    - Added a new field, "trsm_blkszs", to the cntx structure to
      store TRSM-specific block sizes.
    - Implemented routines to initialize, set, and query the TRSM-specific
      block sizes.
    - Defined a new macro, "AOCL_BLIS_ZEN", in the configure script.
      This macro is defined automatically for zen-family architectures.
      It lets the framework choose TRSM-specific cache block sizes instead of
      the common level-3 block sizes.
    
    Change-Id: Id8557b1c962a316b1edecca9cd582675eaf35fe6
    Signed-off-by: Meghana Vankadari <[email protected]>
    AMD-Internal: [CPUPL-656]
    Meghana-vankadari committed Oct 28, 2021
    2643db0
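
For readers unfamiliar with the mechanism the commit message describes, the following is a minimal sketch, not the code from this PR. It assumes a simplified context struct with a second table named `trsm_blkszs` next to the common level-3 table, and a query that falls back to the common value when no TRSM-specific value has been set. Only the field name `trsm_blkszs` and the macro `AOCL_BLIS_ZEN` come from the commit message; every `demo_`/`DEMO_` name, the fallback rule, and the numbers used with it are hypothetical and are not the BLIS API or tuned zen2 values.

```c
#include <assert.h>

/* Per the commit message, configure defines AOCL_BLIS_ZEN automatically for
 * zen-family targets. It is defined here only so that this standalone sketch
 * exercises the TRSM-specific path. */
#ifndef AOCL_BLIS_ZEN
#define AOCL_BLIS_ZEN
#endif

/* Hypothetical identifiers for the three cache block sizes. */
typedef enum { DEMO_MC = 0, DEMO_KC, DEMO_NC, DEMO_NUM_BLKSZS } demo_bszid_t;

/* Hypothetical, heavily simplified stand-in for the cntx structure. */
typedef struct {
    long blkszs[DEMO_NUM_BLKSZS];      /* common level-3 block sizes */
    long trsm_blkszs[DEMO_NUM_BLKSZS]; /* TRSM-specific sizes; 0 means unset */
} demo_cntx_t;

/* Initialize the common block sizes and mark all TRSM entries as unset. */
static void demo_cntx_init(demo_cntx_t *cntx, long mc, long kc, long nc)
{
    cntx->blkszs[DEMO_MC] = mc;
    cntx->blkszs[DEMO_KC] = kc;
    cntx->blkszs[DEMO_NC] = nc;
    for (int i = 0; i < DEMO_NUM_BLKSZS; ++i)
        cntx->trsm_blkszs[i] = 0;
}

/* Store one TRSM-specific block size. */
static void demo_cntx_set_trsm_blksz(demo_cntx_t *cntx, demo_bszid_t id, long value)
{
    assert((int)id >= 0 && (int)id < DEMO_NUM_BLKSZS);
    assert(value > 0);
    cntx->trsm_blkszs[id] = value;
}

/* Query the common level-3 block size (what a routine like GEMM would use). */
static long demo_cntx_get_l3_blksz(const demo_cntx_t *cntx, demo_bszid_t id)
{
    assert((int)id >= 0 && (int)id < DEMO_NUM_BLKSZS);
    return cntx->blkszs[id];
}

/* Query the block size TRSM should use: the TRSM-specific value when the
 * macro is defined and a value has been set, otherwise the common one. */
static long demo_cntx_get_trsm_blksz(const demo_cntx_t *cntx, demo_bszid_t id)
{
    assert((int)id >= 0 && (int)id < DEMO_NUM_BLKSZS);
#ifdef AOCL_BLIS_ZEN
    if (cntx->trsm_blkszs[id] != 0)
        return cntx->trsm_blkszs[id];
#endif
    return cntx->blkszs[id];
}
```

In this sketch, a caller would initialize the context once, override only the entries that TRSM needs to differ on, and leave the rest to fall back; other level-3 routines keep reading the common table and are unaffected.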