Use config flag for locality-aware mpi #941

Status: Open. Wants to merge 1 commit into master.
9 changes: 9 additions & 0 deletions src/parcsr_ls/par_cycle.c
@@ -286,6 +286,15 @@ hypre_BoomerAMGCycle( void *amg_vdata,
hypre_GpuProfilingPushRange(nvtx_name);
while (Not_Finished)
{
#ifdef HYPRE_USING_NODE_AWARE_MPI
if (level >= hypre_HandleNodeAwareSwitchoverThreshold(hypre_handle()))
Contributor:
@liruipeng This switch threshold could be moved inside the AMG struct
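For reference, a minimal sketch of what moving the threshold into the AMG struct might look like; the field name and accessor macro below are illustrative assumptions, not code from this PR:

/* Hypothetical addition to hypre_ParAMGData (sketch only): a per-solver
 * switchover level instead of the global handle entry. */
typedef struct
{
   /* ... existing hypre_ParAMGData members ... */
   HYPRE_Int node_aware_switch_level;  /* first AMG level on which node-aware MPI is used */
} hypre_ParAMGData_sketch;

/* Accessor in the usual hypre macro style (name is an assumption): */
#define hypre_ParAMGDataNodeAwareSwitchLevel(amg_data) \
   ((amg_data)->node_aware_switch_level)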

Author:
For this one, I think it could live in the AMG struct, as long as it can still be set when parsing command-line args, which is where the value currently comes from.

Contributor:
OK. We have access to hypre_ParAMGData in this function. Do we need hypre_HandleNodeAwareSwitchoverThreshold in other functions that don't have that access?

Contributor:
I can only see it being used in par_cycle.c.

Author:
I didn't check these in as permanent changes to the ij driver, but to use node-aware MPI properly you must set hypre_HandleNodeAwareSwitchoverThreshold; otherwise node-aware MPI is used, suboptimally, on every AMG level.

This brings up 2 things:

  1. NodeAwareSwitchoverThreshold must be settable from outside, e.g., from the ij.c driver. It's quite nice to have it as a command-line arg so different thresholds can be tested without recompiling (see the sketch after this list).

  2. We probably want to document this and how to connect to MPI Advance. I can write up the steps, and maybe we could include them somewhere in the wiki, but I don't know how to contribute to the wiki.
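To illustrate point 1, command-line handling in the ij.c driver could look roughly like the following; the option name and helper are assumptions for the sketch, not part of this PR:

#include <stdlib.h>
#include <string.h>

/* Sketch: scan argv for a "-nodeaware_switch_level <n>" option in the same
 * spirit as the other ij.c flags; returns the default when the flag is absent. */
static int
parse_node_aware_switch_level(int argc, char *argv[], int default_level)
{
   int level = default_level;
   for (int i = 1; i + 1 < argc; i++)
   {
      if (strcmp(argv[i], "-nodeaware_switch_level") == 0)
      {
         level = atoi(argv[i + 1]);
      }
   }
   return level;
}

The parsed value would then be handed to the solver through a Set routine like the one suggested in the reply below.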

Contributor:
  1. Yes, there should be a Set routine for that, e.g., HYPRE_BoomerAMGSetNodeAwareSwitchLevel (a sketch follows this list).
  2. Great idea. I'm not sure about wiki access, but if you prepare something in Markdown format, we can add it to the wiki.
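A minimal sketch of what the Set routine suggested in point 1 could look like, following the existing HYPRE_BoomerAMGSet* / hypre_BoomerAMGSet* pattern; the routine names and the node_aware_switch_level field are assumptions, since neither exists in this PR:

/* Internal setter; stores the threshold in the hypothetical AMG struct
 * field sketched earlier in this thread. */
HYPRE_Int
hypre_BoomerAMGSetNodeAwareSwitchLevel( void      *amg_vdata,
                                        HYPRE_Int  switch_level )
{
   hypre_ParAMGData *amg_data = (hypre_ParAMGData *) amg_vdata;

   if (!amg_data)
   {
      hypre_error_in_arg(1);
      return hypre_error_flag;
   }

   hypre_ParAMGDataNodeAwareSwitchLevel(amg_data) = switch_level;

   return hypre_error_flag;
}

/* Public wrapper, in the style of the other HYPRE_BoomerAMGSet* routines. */
HYPRE_Int
HYPRE_BoomerAMGSetNodeAwareSwitchLevel( HYPRE_Solver solver,
                                        HYPRE_Int    switch_level )
{
   return hypre_BoomerAMGSetNodeAwareSwitchLevel( (void *) solver, switch_level );
}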

Contributor (@liruipeng, Sep 20, 2023):
Documenting it in the Wiki is a good idea.

{
hypre_HandleUsingNodeAwareMPI(hypre_handle()) = 1;
Contributor (@victorapm, Sep 20, 2023):
@liruipeng However, this other global variable (which turns node-aware MPI on/off) cannot. One idea is to move it into the comm_pkg struct.

PS: we could default it to zero and implement something like hypre_CommPkgSetNodeAwareMPI to turn comm_pkg->node_aware_mpi on/off.
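For illustration, a setter along those lines could be as small as the following; it is a sketch that assumes a node_aware_mpi field added to hypre_ParCSRCommPkg, which this PR does not have:

/* Sketch: per-package toggle, intended to default to 0 when the comm pkg
 * is created and to be flipped explicitly by the solver. */
HYPRE_Int
hypre_CommPkgSetNodeAwareMPI( hypre_ParCSRCommPkg *comm_pkg,
                              HYPRE_Int            use_node_aware )
{
   if (!comm_pkg)
   {
      hypre_error_in_arg(1);
      return hypre_error_flag;
   }

   comm_pkg->node_aware_mpi = use_node_aware;  /* hypothetical field */

   return hypre_error_flag;
}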

Author:
I think that for this to work, comm_pkg would need to be accessible within hypre_BoomerAMGCycle, which I don't think it is, unless I'm mistaken. I don't see it in hypre_ParAMGData. Maybe it could be added as a field there, but it would need to be properly set and maintained.

Contributor (@victorapm, Sep 20, 2023):
You can use

#define hypre_ParCSRMatrixCommPkg(matrix) ((matrix) -> comm_pkg)

e.g., hypre_ParCSRMatrixCommPkg(A_array[lvl])

Contributor:
I could be wrong, but at what granularity do we want to control hypre_HandleUsingNodeAwareMPI? If we put it in the comm_pkg, it would be per comm_pkg/ParCSRMatrix. I thought it was a more global thing, like the whole AMG solver. Again, putting it into the AMG struct is ideal, if we can find a clean way to do it.

Contributor:
As far as I can see, the node aware option has been implemented only for the IJ interface. If we want it on the other interfaces in the future, we could extend their respective comm_pkg structs to handle this option.

Note I'm suggesting putting using_node_aware_mpi under the comm_pkg struct here, i.e., not under the AMG struct as in the other case :)

Author:
This may still work, but my understanding is that we have multiple matrices per level, each with its own comm pkg. The using_node_aware toggle logically applies to a whole level with multiple matrices, so we would need to get the comm pkgs for both A and P and set node-aware MPI on each of them. I think that would work. What do y'all think?
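To make that concrete, a per-level toggle over both operators might look roughly like this inside the cycle loop; it is a sketch that assumes the hypothetical hypre_CommPkgSetNodeAwareMPI setter discussed above and the A_array/P_array level arrays already used in hypre_BoomerAMGCycle:

/* Sketch: turn node-aware MPI on for this level's matrices only at or
 * beyond the switchover level. */
HYPRE_Int use_node_aware = (level >= node_aware_switch_level) ? 1 : 0;

hypre_ParCSRCommPkg *A_pkg = hypre_ParCSRMatrixCommPkg(A_array[level]);
if (A_pkg)
{
   hypre_CommPkgSetNodeAwareMPI(A_pkg, use_node_aware);
}

/* P_array holds the interpolation operators between levels, so guard the
 * coarsest level before touching it. */
if (level < num_levels - 1)
{
   hypre_ParCSRCommPkg *P_pkg = hypre_ParCSRMatrixCommPkg(P_array[level]);
   if (P_pkg)
   {
      hypre_CommPkgSetNodeAwareMPI(P_pkg, use_node_aware);
   }
}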

Contributor (@victorapm, Sep 20, 2023):
Thanks @geraldc-unm! I spoke to Rui Peng and shared that, in my opinion, this is the better approach. Let's think about this (moving the variable inside comm_pkg) a bit more...

} else
{
hypre_HandleUsingNodeAwareMPI(hypre_handle()) = 0;
}
#endif
if (num_levels > 1)
{
local_size = hypre_VectorSize(hypre_ParVectorLocalVector(F_array[level]));
16 changes: 16 additions & 0 deletions src/parcsr_mv/_hypre_parcsr_mv.h
@@ -23,6 +23,10 @@ extern "C" {
#ifndef HYPRE_PAR_CSR_COMMUNICATION_HEADER
#define HYPRE_PAR_CSR_COMMUNICATION_HEADER

#ifdef HYPRE_USING_NODE_AWARE_MPI
#include "mpi_advance.h"
#endif

/*--------------------------------------------------------------------------
* hypre_ParCSRCommPkg:
* Structure containing information for doing communications
@@ -59,13 +63,20 @@ typedef struct
void *recv_data_buffer;
HYPRE_Int num_requests;
hypre_MPI_Request *requests;
#ifdef HYPRE_USING_NODE_AWARE_MPI
MPIX_Request *Xrequest;
#endif
} hypre_ParCSRCommHandle;

typedef hypre_ParCSRCommHandle hypre_ParCSRPersistentCommHandle;

typedef struct _hypre_ParCSRCommPkg
{
MPI_Comm comm;
#ifdef HYPRE_USING_NODE_AWARE_MPI
MPIX_Comm *neighbor_comm;
MPIX_Comm *neighborT_comm;
#endif
HYPRE_Int num_components;
HYPRE_Int num_sends;
HYPRE_Int *send_procs;
@@ -75,6 +86,11 @@
HYPRE_Int num_recvs;
HYPRE_Int *recv_procs;
HYPRE_Int *recv_vec_starts;
HYPRE_Int use_neighbor;
#ifdef HYPRE_USING_NODE_AWARE_MPI
long *global_send_indices;
Contributor:
Maybe it's better to use hypre_uint64, see

typedef uint64_t hypre_uint64;
or hypre_longint here
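In other words, the two index arrays in the comm-pkg struct above would be declared with a fixed-width hypre type rather than a bare long, e.g. (sketch of the suggested change, not code from this PR):

#ifdef HYPRE_USING_NODE_AWARE_MPI
   /* 64-bit global indices regardless of the platform's long size */
   hypre_uint64 *global_send_indices;
   hypre_uint64 *global_recv_indices;
#endif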

long *global_recv_indices;
#endif
/* remote communication information */
hypre_MPI_Datatype *send_mpi_types;
hypre_MPI_Datatype *recv_mpi_types;
40 changes: 40 additions & 0 deletions src/parcsr_mv/new_commpkg.c
@@ -11,6 +11,9 @@
*-----------------------------------------------------*/

#include "_hypre_parcsr_mv.h"
#ifdef HYPRE_USING_NODE_AWARE_MPI
#include <mpi.h>
#endif

/* some debugging tools*/
#define mydebug 0
@@ -546,6 +549,43 @@ hypre_ParCSRCommPkgCreateApart
num_sends, send_procs, send_map_starts,
send_map_elmts,
&comm_pkg);
#ifdef HYPRE_USING_NODE_AWARE_MPI
if (comm_pkg->use_neighbor) {
HYPRE_Int *sendcounts = hypre_TAlloc(HYPRE_Int, num_sends, HYPRE_MEMORY_HOST);
HYPRE_Int *recvcounts = hypre_TAlloc(HYPRE_Int, num_recvs, HYPRE_MEMORY_HOST);
for (HYPRE_Int i = 0; i < num_sends; i++) {
sendcounts[i] = send_map_starts[i+1] - send_map_starts[i];
}
for (HYPRE_Int i = 0; i < num_recvs; i++) {
recvcounts[i] = recv_vec_starts[i+1] - recv_vec_starts[i];
}
MPIX_Dist_graph_create_adjacent( comm, num_recvs, hypre_ParCSRCommPkgRecvProcs(comm_pkg),
recvcounts,
num_sends, hypre_ParCSRCommPkgSendProcs(comm_pkg),
sendcounts,
MPI_INFO_NULL, 0, &(comm_pkg->neighbor_comm));
MPIX_Dist_graph_create_adjacent( comm, num_sends, hypre_ParCSRCommPkgSendProcs(comm_pkg),
sendcounts,
num_recvs, hypre_ParCSRCommPkgRecvProcs(comm_pkg),
recvcounts,
MPI_INFO_NULL, 0, &(comm_pkg->neighborT_comm));

HYPRE_Int num_send_elmts = send_map_starts[num_sends];
comm_pkg->global_send_indices = hypre_CTAlloc(long, num_send_elmts, HYPRE_MEMORY_HOST);
for (int i = 0; i < num_sends; i++) {
for (int j = send_map_starts[i]; j < send_map_starts[i+1]; j++) {
comm_pkg->global_send_indices[j] = send_map_elmts[j] + first_col_diag;
}
}
HYPRE_Int num_recv_elmts = recv_vec_starts[num_recvs];
comm_pkg->global_recv_indices = hypre_CTAlloc(long, num_recv_elmts, HYPRE_MEMORY_HOST);
for (int i = 0; i < num_recvs; i++) {
for (int j = recv_vec_starts[i]; j < recv_vec_starts[i+1]; j++) {
comm_pkg->global_recv_indices[j] = col_map_off_d[j];
}
}
}
#endif

return hypre_error_flag;
}