v2025.03.0 RC #305

Merged: 25 commits, Mar 19, 2025
73af76a
Merge pull request #274 from LLNL/main
adayton1 Jul 26, 2024
09fe8af
Fix syntax error in raja-chai integration test (#276)
liu15 Sep 10, 2024
6e1388b
Tabale reproducer (#281)
liu15 Sep 18, 2024
e083a79
Fix config variable in release notes (#282)
adayton1 Sep 18, 2024
ec3fe01
Update toolchains on tioga, lassen, ruby and poodle (#275)
adrienbernede Sep 24, 2024
a1be998
Update and standardize implementation of packages, in sync with spack …
adrienbernede Oct 14, 2024
ff456cf
Remove deprecated getPointer method (#284)
adayton1 Oct 29, 2024
7ba2ba8
Remove ManagedArray implicit casts (#285)
adayton1 Oct 29, 2024
c12475b
Remove make_managed_from_factory (#291)
adayton1 Dec 5, 2024
fc20f3f
fix reallocation for pinned memory. (#292)
robinson96 Dec 9, 2024
de7efab
Remove incr and decr methods on ManagedArray (#295)
adayton1 Jan 27, 2025
c43103c
Always enable pick and set (#296)
adayton1 Jan 28, 2025
799ce36
Document when XNACK is required (#297)
adayton1 Feb 4, 2025
6588a53
Squash amdclang warnings. (#299)
mdavis36 Feb 5, 2025
660d094
Update copyright to 2025 (#300)
adayton1 Feb 5, 2025
91ec504
Support allocators in thin mode (#301)
adayton1 Feb 7, 2025
a3c1915
Remove ManagedArray::getActiveBasePointer (#302)
adayton1 Feb 11, 2025
a989bd2
Various updates in RADIUSS Spack Configs (#290)
adrienbernede Feb 13, 2025
214d6ff
Restore getActiveBasePointer (#303)
adayton1 Mar 11, 2025
ce37124
Update release number
adayton1 Mar 11, 2025
30c86b1
Update to BLT v0.7.0
adayton1 Mar 11, 2025
58c1331
Update to RAJA v2025.03.0-RC
adayton1 Mar 18, 2025
453488c
Update to Umpire v2025.03.0
adayton1 Mar 18, 2025
d75a084
Update release date
adayton1 Mar 18, 2025
a433d27
Update release date
adayton1 Mar 19, 2025
Remove make_managed_from_factory (#291)
* Remove make_managed_from_factory
adayton1 authored Dec 5, 2024
commit c12475b46366507e5e68b0af457ba7023aa82d53
3 changes: 2 additions & 1 deletion RELEASE_NOTES.md
@@ -18,7 +18,8 @@ The format of this file is based on [Keep a Changelog](http://keepachangelog.com
 ### Removed
 - Removes deprecated ManagedArray::getPointer method. Use ManagedArray::data instead.
 - Removes optional support for implicitly casting between raw pointers and ManagedArrays (CHAI\_ENABLE\_IMPLICIT\_CONVERSIONS). Use makeManagedArray and ManagedArray::data to perform explicit conversions instead.
-- Removes equality and inequality comparison operators between ManagedArrays and raw pointers
+- Removes equality and inequality comparison operators between ManagedArrays and raw pointers.
+- Removes make\_managed\_from\_factory function for creating managed\_ptr objects from factory functions. This change will lead to safer adoption of allocators during construction and destruction of managed\_ptr objects.

 ## [Version 2024.07.0] - Release date 2024-07-26

218 changes: 25 additions & 193 deletions src/chai/managed_ptr.hpp
@@ -57,10 +57,10 @@ namespace chai {
 ///
 /// This wrapper stores both host and device pointers so that polymorphism can be
 /// used in both contexts with a single API.
-/// The make_managed and make_managed_from_factory functions call new on both the
-/// host and device so that polymorphism is valid in both contexts. Simply copying
-/// the bits of an object to the device will not copy the vtable, so new must be
-/// called on the device.
+/// The make_managed function calls new on both the host and device so that
+/// polymorphism is valid in both contexts. Simply copying the bits of an
+/// object to the device will not copy the vtable, so new must be called
+/// on the device.
 ///
 /// Usage Requirements:
 /// Methods that can be called on the host and/or device must be declared
@@ -72,28 +72,27 @@ namespace chai {
 /// you must explicitly modify the object in both the host context and the
 /// device context.
 /// C-style array members of T need to be initialized correctly with a host or
-/// device C-style array. If a ManagedArray is passed to the make_managed or
-/// make_managed_from_factory methods in place of a C-style array, wrap it in
-/// a call to chai::unpack to extract the C-style arrays contained within the
-/// ManagedArray. This will pass the extracted host C-style array to the host
-/// constructor and the extracted device C-style array to the device
-/// constructor. If it is desired that these host and device C-style arrays be
-/// kept in sync like the normal behavior of ManagedArray, define a callback
-/// that maintains a copy of the ManagedArray and upon the ACTION_MOVE event
-/// calls the copy constructor of that ManagedArray.
+/// device C-style array. If a ManagedArray is passed to the make_managed
+/// function in place of a C-style array, wrap it in a call to chai::unpack to
+/// extract the C-style arrays contained within the ManagedArray. This will
+/// pass the extracted host C-style array to the host constructor and the
+/// extracted device C-style array to the device constructor. If it is desired
+/// that these host and device C-style arrays be kept in sync like the normal
+/// behavior of ManagedArray, define a callback that maintains a copy of the
+/// ManagedArray and upon the ACTION_MOVE event calls the copy constructor of
+/// that ManagedArray.
 /// If a C-style array is passed to make_managed, accessing that member will be
 /// valid only in the correct context. To prevent the accidental use of that
 /// member in the wrong context, any methods that access it should be __host__
 /// only or __device__ only. Special care should be taken when passing C-style
 /// arrays as arguments to member functions.
 /// The same restrictions for C-style array members also apply to raw pointer
-/// members. If a managed_ptr is passed to the make_managed or
-/// make_managed_from_factory methods in place of a raw pointer, wrap it in
-/// a call to chai::unpack to extract the raw pointers contained within the
-/// managed_ptr. This will pass the extracted host pointer to the host
-/// constructor and the extracted device pointer to the device constructor.
-/// If it is desired that these host and device pointers be kept in sync,
-/// define a callback that maintains a copy of the managed_ptr and upon the
+/// members. If a managed_ptr is passed to the make_managed function in place of
+/// a raw pointer, wrap it in a call to chai::unpack to extract the raw pointers
+/// contained within the managed_ptr. This will pass the extracted host pointer
+/// to the host constructor and the extracted device pointer to the device
+/// constructor. If it is desired that these host and device pointers be kept in
+/// sync, define a callback that maintains a copy of the managed_ptr and upon the
 /// ACTION_MOVE event call the copy constructor of that managed_ptr.
 /// Again, if a raw pointer is passed to make_managed, accessing that member will
 /// only be valid in the correct context. Take care when passing raw pointers
@@ -102,12 +101,12 @@ namespace chai {
 /// turn off GPU error checking, pass -DCHAI_ENABLE_GPU_ERROR_CHECKING=OFF as
 /// an argument to cmake when building CHAI. To turn on synchronization after
 /// every kernel, set the appropriate environment variable (e.g. CUDA_LAUNCH_BLOCKING or HIP_LAUNCH_BLOCKING).
-/// Alternatively, call cudaDeviceSynchronize() after any call to make_managed,
-/// make_managed_from_factory, or managed_ptr::free, and check the return code
-/// for errors. If your code crashes in the constructor/destructor of T, then
-/// it is recommended to turn on this synchronization. For example, the
-/// constructor of T might run out of per-thread stack space on the GPU. If
-/// that happens, you can increase the device limit of per-thread stack space.
+/// Alternatively, call cudaDeviceSynchronize() after any call to make_managed
+/// or managed_ptr::free, and check the return code for errors. If your code
+/// crashes in the constructor/destructor of T, then it is recommended to turn
+/// on this synchronization for debugging. For example, the constructor of T
+/// might run out of per-thread stack space on the GPU. If that happens, you
+/// can increase the device limit of per-thread stack space.
 ///
 template <typename T>
 class managed_ptr {
@@ -804,27 +803,6 @@ namespace chai {
 *gpuPointer = new T(processArguments(args)...);
 }

-///
-/// @author Alan Dayton
-///
-/// Creates a new object on the device by calling the given factory method.
-///
-/// @param[out] gpuPointer Used to return the device pointer to the new object
-/// @param[in] f The factory method (must be a __device__ or __host__ __device__
-/// method
-/// @param[in] args The arguments to the factory method
-///
-/// @note Cannot capture argument packs in an extended device lambda,
-/// so explicit kernel is needed.
-///
-template <typename T,
-typename F,
-typename... Args>
-CHAI_GLOBAL void make_on_device_from_factory(T** gpuPointer, F f, Args... args)
-{
-*gpuPointer = f(processArguments(args)...);
-}
-
 ///
 /// @author Alan Dayton
 ///
@@ -933,44 +911,6 @@ namespace chai {
 return cpuPointer;
 }

-///
-/// @author Alan Dayton
-///
-/// Calls a factory method to create a new object on the host.
-/// Sets the execution space to the CPU so that ManagedArrays and managed_ptrs
-/// are moved to the host as necessary.
-///
-/// @param[in] f The factory method
-/// @param[in] args The arguments to the factory method
-///
-/// @return The host pointer to the new object
-///
-template <typename T,
-typename F,
-typename... Args>
-CHAI_HOST T* make_on_host_from_factory(F f, Args&&... args) {
-#if !defined(CHAI_DISABLE_RM)
-// Get the ArrayManager and save the current execution space
-chai::ArrayManager* arrayManager = chai::ArrayManager::getInstance();
-ExecutionSpace currentSpace = arrayManager->getExecutionSpace();
-
-// Set the execution space so that ManagedArrays and managed_ptrs
-// are handled properly
-arrayManager->setExecutionSpace(CPU);
-#endif
-
-// Create the object on the device
-T* cpuPointer = f(args...);
-
-#if !defined(CHAI_DISABLE_RM)
-// Set the execution space back to the previous value
-arrayManager->setExecutionSpace(currentSpace);
-#endif
-
-// Return the GPU pointer
-return cpuPointer;
-}
-
 ///
 /// @author Alan Dayton
 ///
@@ -1035,67 +975,6 @@ namespace chai {
 free(cpuBuffer);
 gpuFree(gpuBuffer);

-#if !defined(CHAI_DISABLE_RM)
-// Set the execution space back to the previous value
-arrayManager->setExecutionSpace(currentSpace);
-#endif
-
-// Return the GPU pointer
-return gpuPointer;
-}
-
-///
-/// @author Alan Dayton
-///
-/// Calls a factory method to create a new object on the device.
-///
-/// @param[in] f The factory method
-/// @param[in] args The arguments to the factory method
-///
-/// @return The device pointer to the new object
-///
-template <typename T,
-typename F,
-typename... Args>
-CHAI_HOST T* make_on_device_from_factory(F f, Args&&... args) {
-#if !defined(CHAI_DISABLE_RM)
-// Get the ArrayManager and save the current execution space
-chai::ArrayManager* arrayManager = chai::ArrayManager::getInstance();
-ExecutionSpace currentSpace = arrayManager->getExecutionSpace();
-#if defined(CHAI_ENABLE_GPU_SIMULATION_MODE)
-arrayManager->setGPUSimMode(true);
-#endif
-
-// Set the execution space so that chai::ManagedArrays and
-// chai::managed_ptrs are handled properly
-arrayManager->setExecutionSpace(GPU);
-#endif
-
-// Allocate space on the GPU to hold the pointer to the new object
-T** gpuBuffer;
-gpuMalloc((void**)(&gpuBuffer), sizeof(T*));
-
-// Create the object on the device
-#if defined(CHAI_ENABLE_GPU_SIMULATION_MODE)
-detail::make_on_device_from_factory(gpuBuffer, f, args...);
-arrayManager->setGPUSimMode(false);
-#elif defined(__CUDACC__) && defined(CHAI_ENABLE_MANAGED_PTR_ON_GPU)
-detail::make_on_device_from_factory<T><<<1, 1>>>(gpuBuffer, f, args...);
-#elif defined(__HIPCC__) && defined(CHAI_ENABLE_MANAGED_PTR_ON_GPU)
-hipLaunchKernelGGL(detail::make_on_device_from_factory, 1, 1, 0, 0, gpuBuffer, f, args...);
-#endif
-
-// Allocate space on the CPU for the pointer and copy the pointer to the CPU
-T** cpuBuffer = (T**) malloc(sizeof(T*));
-gpuMemcpy(cpuBuffer, gpuBuffer, sizeof(T*), gpuMemcpyDeviceToHost);
-
-// Get the GPU pointer
-T* gpuPointer = cpuBuffer[0];
-
-// Free the host and device buffers
-free(cpuBuffer);
-gpuFree(gpuBuffer);
-
 #if !defined(CHAI_DISABLE_RM)
 // Set the execution space back to the previous value
 arrayManager->setExecutionSpace(currentSpace);
@@ -1155,53 +1034,6 @@ namespace chai {
 #endif
 }

-///
-/// @author Alan Dayton
-///
-/// Makes a managed_ptr<T>.
-/// Factory function to create managed_ptrs.
-///
-/// @param[in] f The factory function that will create the object
-/// @param[in] args The arguments to the factory function
-///
-template <typename T,
-typename F,
-typename... Args>
-CHAI_HOST managed_ptr<T> make_managed_from_factory(F&& f, Args&&... args) {
-static_assert(detail::is_invocable<F, Args...>::value,
-"F is not invocable with the given arguments.");
-
-static_assert(std::is_pointer<typename std::result_of<F(Args...)>::type>::value,
-"F does not return a pointer.");
-
-using R = typename std::remove_pointer<typename std::result_of<F(Args...)>::type>::type;
-
-static_assert(std::is_convertible<R*, T*>::value,
-"F does not return a pointer that is convertible to T*.");
-
-#if (defined(CHAI_GPUCC) || defined(CHAI_ENABLE_GPU_SIMULATION_MODE)) && defined(CHAI_ENABLE_MANAGED_PTR_ON_GPU)
-// Construct on the GPU first to take advantage of asynchrony
-#if defined(CHAI_ENABLE_GPU_SIMULATION_MODE)
-chai::ArrayManager* threadRM = chai::ArrayManager::getInstance();
-threadRM->setGPUSimMode(true);
-#endif
-T* gpuPointer = make_on_device_from_factory<R>(f, args...);
-#if defined(CHAI_ENABLE_GPU_SIMULATION_MODE)
-threadRM->setGPUSimMode(false);
-#endif
-#endif
-
-// Construct on the CPU
-T* cpuPointer = make_on_host_from_factory<R>(f, args...);
-
-// Construct and return the managed_ptr
-#if defined(CHAI_GPUCC) && defined(CHAI_ENABLE_MANAGED_PTR_ON_GPU)
-return managed_ptr<T>({CPU, GPU}, {cpuPointer, gpuPointer});
-#else
-return managed_ptr<T>({CPU}, {cpuPointer});
-#endif
-}
-
 ///
 /// @author Alan Dayton
 ///
153 changes: 1 addition & 152 deletions tests/unit/managed_ptr_unit_tests.cpp
@@ -53,9 +53,8 @@ class Simple {
 class TestBase {
 public:
 CHAI_HOST_DEVICE TestBase() {}
-CHAI_HOST_DEVICE virtual ~TestBase() {}

-CHAI_HOST_DEVICE static TestBase* Factory(const int value);
+CHAI_HOST_DEVICE virtual ~TestBase() {}

 CHAI_HOST_DEVICE virtual int getValue() const = 0;
 };
@@ -73,23 +72,6 @@ class TestDerived : public TestBase {
 int m_value;
 };

-CHAI_HOST_DEVICE TestBase* TestBase::Factory(const int value) {
-return new TestDerived(value);
-}
-
-CHAI_HOST_DEVICE TestBase* Factory(const int value) {
-return new TestDerived(value);
-}
-
-CHAI_HOST_DEVICE TestBase* OverloadedFactory() {
-return new TestDerived(-1);
-}
-
-CHAI_HOST_DEVICE TestBase* OverloadedFactory(const int value) {
-return new TestDerived(value);
-}
-
-
 TEST(managed_ptr, default_constructor)
 {
 chai::managed_ptr<TestDerived> derived;
@@ -766,94 +748,6 @@ GPU_TEST(managed_ptr, gpu_make_managed)
 derived.free();
 }

-GPU_TEST(managed_ptr, make_managed_from_factory_function)
-{
-const int expectedValue = rand();
-
-auto factory = [] CHAI_HOST_DEVICE (const int value) {
-return Factory(value);
-};
-
-auto derived = chai::make_managed_from_factory<TestBase>(factory, expectedValue);
-
-EXPECT_EQ((*derived).getValue(), expectedValue);
-
-EXPECT_NE(derived.get(), nullptr);
-EXPECT_TRUE(derived);
-EXPECT_FALSE(derived == nullptr);
-EXPECT_FALSE(nullptr == derived);
-EXPECT_TRUE(derived != nullptr);
-EXPECT_TRUE(nullptr != derived);
-
-derived.free();
-}
-
-GPU_TEST(managed_ptr, make_managed_from_factory_lambda)
-{
-const int expectedValue = rand();
-
-auto factory = [] CHAI_HOST_DEVICE (const int value) {
-return new TestDerived(value);
-};
-
-auto derived = chai::make_managed_from_factory<TestBase>(factory, expectedValue);
-
-EXPECT_EQ((*derived).getValue(), expectedValue);
-
-EXPECT_NE(derived.get(), nullptr);
-EXPECT_TRUE(derived);
-EXPECT_FALSE(derived == nullptr);
-EXPECT_FALSE(nullptr == derived);
-EXPECT_TRUE(derived != nullptr);
-EXPECT_TRUE(nullptr != derived);
-
-derived.free();
-}
-
-GPU_TEST(managed_ptr, make_managed_from_overloaded_factory_function)
-{
-const int expectedValue = rand();
-
-auto factory = [] CHAI_HOST_DEVICE (const int value) {
-return OverloadedFactory(value);
-};
-
-auto derived = chai::make_managed_from_factory<TestBase>(factory, expectedValue);
-
-EXPECT_EQ((*derived).getValue(), expectedValue);
-
-EXPECT_NE(derived.get(), nullptr);
-EXPECT_TRUE(derived);
-EXPECT_FALSE(derived == nullptr);
-EXPECT_FALSE(nullptr == derived);
-EXPECT_TRUE(derived != nullptr);
-EXPECT_TRUE(nullptr != derived);
-
-derived.free();
-}
-
-GPU_TEST(managed_ptr, make_managed_from_factory_static_member_function)
-{
-const int expectedValue = rand();
-
-auto factory = [] CHAI_HOST_DEVICE (const int value) {
-return TestBase::Factory(value);
-};
-
-auto derived = chai::make_managed_from_factory<TestBase>(factory, expectedValue);
-
-EXPECT_EQ((*derived).getValue(), expectedValue);
-
-EXPECT_NE(derived.get(), nullptr);
-EXPECT_TRUE(derived);
-EXPECT_FALSE(derived == nullptr);
-EXPECT_FALSE(nullptr == derived);
-EXPECT_TRUE(derived != nullptr);
-EXPECT_TRUE(nullptr != derived);
-
-derived.free();
-}
-
 GPU_TEST(managed_ptr, gpu_copy_constructor)
 {
 const int expectedValue = rand();
@@ -1048,48 +942,3 @@ GPU_TEST(managed_ptr, gpu_copy_assignment_operator)
 }

 #endif
-
-// Enable the following tests to ensure that proper compiler errors are given
-// for bad arguments since otherwise it is difficult to make sure the template
-// metaprogramming is correct.
-
-#if 0
-
-// Should give something like the following:
-// error: static assertion failed: F is not invocable with the given arguments.
-
-TEST(managed_ptr, bad_function_to_make_managed_from_factory_function)
-{
-const int expectedValue = rand();
-
-auto factory = [] CHAI_HOST (const int value) {
-return new TestDerived(value);
-};
-
-auto derived = chai::make_managed_from_factory<TestBase>(expectedValue, factory);
-
-EXPECT_EQ((*derived).getValue(), expectedValue);
-}
-
-#endif
-
-#if 0
-
-// Should give something like the following:
-// error: static assertion failed: F is not invocable with the given arguments.
-
-TEST(managed_ptr, bad_arguments_to_make_managed_from_factory_function)
-{
-const int expectedValue = rand();
-
-auto factory = [] CHAI_HOST (const int value) {
-return new TestDerived(value);
-};
-
-auto derived = chai::make_managed_from_factory<TestBase>(factory, expectedValue, 3);
-
-EXPECT_EQ((*derived).getValue(), expectedValue);
-}
-
-#endif
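
Migration note (a sketch, not part of this commit): call sites that used make_managed_from_factory to obtain a base-class pointer can likely construct the concrete type directly, since the doc comment in managed_ptr.hpp describes make_managed as calling new with the constructor arguments. The example below is host-only and does not include CHAI. `make_on_host_sketch` and `migrated_value` are hypothetical names, the two classes mirror the test classes in this diff, and the CHAI call shown in the "After" comment is an assumed call shape that should be checked against the library.

```cpp
#include <cassert>
#include <utility>

// Stand-ins mirroring TestBase/TestDerived from the unit tests (no CHAI macros).
class TestBase {
public:
  virtual ~TestBase() {}
  virtual int getValue() const = 0;
};

class TestDerived : public TestBase {
public:
  explicit TestDerived(int value) : m_value(value) {}
  int getValue() const override { return m_value; }

private:
  int m_value;
};

// Hypothetical host-only helper: forwards constructor arguments to new.
// The device-side construction that CHAI performs is deliberately omitted.
template <typename T, typename... Args>
T* make_on_host_sketch(Args&&... args) {
  return new T(std::forward<Args>(args)...);
}

// Before (removed in this commit, as used in the deleted tests):
//   auto derived = chai::make_managed_from_factory<TestBase>(factory, value);
// After (assumed call shape, not verified here):
//   auto derived = chai::make_managed<TestDerived>(value);
//
// Host-only analogue of the "after" form: name the concrete type at the
// call site, then use it through the base-class interface.
int migrated_value(int value) {
  TestBase* base = make_on_host_sketch<TestDerived>(value);
  assert(base != nullptr);
  const int result = base->getValue();
  delete base;  // safe through the base pointer: the destructor is virtual
  return result;
}
```

The trade-off is that the call site must name the concrete type. Code that relied on a factory to pick the derived type at run time would need to branch at the call site instead.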