Support for Mac M1/M2 #947
base: main
Conversation
Was trying to run this on my M1 Pro but got the following error. Posting the full log below for completeness:

[2023-12-07 16:52:39,628] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Traceback (most recent call last):
File "example_text_completion.py", line 69, in <module>
fire.Fire(main)
File ".conda/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File ".conda/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File ".conda/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "example_text_completion.py", line 32, in main
generator = Llama.build(
File "llama/generation.py", line 118, in build
checkpoint = torch.load(ckpt_path, map_location="cpu")
File ".conda/lib/python3.10/site-packages/torch/serialization.py", line 1028, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File ".conda/lib/python3.10/site-packages/torch/serialization.py", line 1246, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
[2023-12-07 16:52:44,653] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 15102) of binary: .conda/bin/python
Traceback (most recent call last):
File ".conda/bin/torchrun", line 8, in <module>
sys.exit(main())
File ".conda/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File ".conda/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
run(args)
File ".conda/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
elastic_launch(
File ".conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File ".conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
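For what it's worth, `invalid load key, '<'` from `pickle` usually means the file handed to `torch.load` is not a checkpoint at all but an HTML error page (for instance from an expired or failed download link), since HTML begins with `<`. A minimal sketch to check, using a hypothetical checkpoint path rather than anything from this PR:

```python
# Hypothetical path for illustration; substitute the checkpoint you downloaded.
ckpt_path = "llama-2-7b/consolidated.00.pth"

with open(ckpt_path, "rb") as f:
    head = f.read(16)

# A real checkpoint begins with pickle/zip magic bytes; a failed download
# often saves an HTML error page instead, which starts with b'<'.
print(head)
```

If this prints something like `b'<!DOCTYPE html>'` or `b'<html>'`, the weights need to be re-downloaded before the MPS changes can be tested.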
Seems to work for me! Thanks @dkrantsberg! 2021 MacBook Pro, M1 Pro, 32GB, Sonoma 14.2.1, Python 3.11.7
ok
This works on my machine. Would be helpful to merge this. Thanks @dkrantsberg.
M3
Confirmed it's working on my M2 Pro MacBook Pro with macOS 14.4.1 (23E224). This should be integrated into the official release as an option for Apple silicon.
Adds support for Apple silicon processors by using MPS/CPU instead of CUDA.
Same changes as in the Code Llama PR meta-llama/codellama#18.
Tested on M1 Max, macOS 13.4 (Ventura), PyTorch 2.1.1.
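For context, the usual shape of this kind of change is a device-selection fallback rather than hard-coded CUDA; the sketch below shows that pattern under the assumption it matches the spirit of the diff, not its exact contents. Ports like this typically also swap the NCCL distributed backend for Gloo, since NCCL is CUDA-only.

```python
import torch

# Sketch of the device-selection pattern this PR describes (the actual diff
# may differ): prefer MPS on Apple silicon, then CUDA, then fall back to CPU.
def pick_device() -> torch.device:
    if torch.backends.mps.is_available():
        return torch.device("mps")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

device = pick_device()
x = torch.randn(2, 3, device=device)  # tensor created on the chosen device
print(device, x.device)
```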