
Refactor checkpoint conversion script for improved readability and efficiency #633

Open · wants to merge 1 commit into main

Conversation

tdas3001

  • Replaced os.path.join with pathlib.Path for improved readability and efficiency
  • Used Path.glob() instead of glob(os.path.join(...)) for file operations
  • Simplified directory creation with Path.mkdir(parents=True, exist_ok=True)
  • Refactored parameter splitting logic to remove unnecessary loops and redundant conditions
  • Used direct indexing for expert assignment instead of iterating over all model parallel indices
  • Replaced .narrow() with Python slicing for improved readability
  • Added an assertion to ensure mp > 0 before proceeding
  • Reorganized key mapping logic for better readability and maintainability
  • Optimized name modifications using a single chained .replace() operation
  • Ensured that only necessary operations are performed when processing model parameters
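A minimal sketch of what the pathlib-based changes in the bullets above might look like (the helper name and the `*.safetensors` glob pattern are illustrative assumptions, not taken from the PR diff):

```python
from pathlib import Path

def collect_checkpoint_files(hf_ckpt_path: str, save_path: str) -> list:
    """Illustrative helper: gather input shards and prepare the output dir."""
    ckpt_dir = Path(hf_ckpt_path)
    out_dir = Path(save_path)
    # Path.mkdir(parents=True, exist_ok=True) replaces os.makedirs(...)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Path.glob replaces glob(os.path.join(hf_ckpt_path, "*.safetensors"))
    return sorted(ckpt_dir.glob("*.safetensors"))
```

The "single chained .replace()" bullet would similarly collapse several sequential key rewrites into one expression, e.g. `name.replace(a, b).replace(c, d)`, with the actual substrings coming from the script's key-mapping table.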

Comment on lines +80 to +89
if "experts" in name and "shared_experts" not in name:
idx = int(name.split(".")[-3])
target_index = idx // n_local_experts
if target_index < mp:
state_dicts[target_index][name] = param
elif dim is not None:
assert param.size(dim) % mp == 0, f"Dimension {dim} must be divisible by {mp}"
shard_size = param.size(dim) // mp
for i in range(mp):
state_dicts[i][name] = param[:, i * shard_size : (i + 1) * shard_size] if dim == 1 else param[i * shard_size : (i + 1) * shard_size]
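To make the two routing rules in this hunk concrete, here is a hedged, torch-free sketch; the parameter name in the test is a made-up example in the usual HF key layout, and `shard_slices` only mirrors the index arithmetic, not the tensor copy:

```python
def expert_shard(name: str, n_local_experts: int) -> int:
    # For "…experts.<idx>.…" keys, the expert index is the third-from-last
    # dot-separated field; integer division maps it to its target rank.
    idx = int(name.split(".")[-3])
    return idx // n_local_experts

def shard_slices(size: int, mp: int) -> list:
    # Plain-slicing equivalent of tensor.narrow(dim, i * shard, shard),
    # as used above for the non-expert tensors split along `dim`.
    assert size % mp == 0, f"size {size} must be divisible by mp={mp}"
    shard = size // mp
    return [slice(i * shard, (i + 1) * shard) for i in range(mp)]
```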


We can add comments here for better clarity.

@@ -43,46 +44,55 @@ def main(hf_ckpt_path, save_path, n_experts, mp):
    Returns:
        None
    """
    assert mp > 0, "Model parallelism (mp) must be greater than 0"

    torch.set_num_threads(8)
    n_local_experts = n_experts // mp
    state_dicts = [{} for _ in range(mp)]


Add a comment defining the state_dicts variable.
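Along the lines of the reviewer's suggestion, a commented version of this hunk's setup might read as follows (wrapped in a hypothetical helper, with the `torch.set_num_threads(8)` call omitted so the sketch stays dependency-free):

```python
def init_shards(n_experts: int, mp: int):
    # Guard added by this PR: a zero or negative mp would make the
    # integer divisions below meaningless.
    assert mp > 0, "Model parallelism (mp) must be greater than 0"
    # Number of experts each model-parallel rank is responsible for.
    n_local_experts = n_experts // mp
    # One empty state dict per rank; state_dicts[i] accumulates the
    # parameters destined for rank i's checkpoint shard.
    state_dicts = [{} for _ in range(mp)]
    return n_local_experts, state_dicts
```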

3 participants