minor fixes to tools prepare_data validators (#47) (#26)
* Ensure that only a single whitespace is prepended. Ensure the message regarding the prompt separator is displayed only if a prompt separator exists.
* Change the pandas `contains` call to not use regex, which can misfire if the common_suffix contains regex metacharacters.
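The second fix can be illustrated with a small sketch. It assumes the validators call pandas `Series.str.contains`, which treats its pattern as a regular expression by default; the sample prompts and the `[END]` suffix are made up for illustration and do not come from the commit.

```python
import pandas as pd

# Hypothetical prompts: only the first one actually ends with the suffix.
prompts = pd.Series(["Summarize this. [END]", "No marker, but has an E"])
common_suffix = "[END]"

# Default behaviour (regex=True): "[END]" is read as a character class,
# so any string containing E, N or D counts as a match.
as_regex = prompts.str.contains(common_suffix).tolist()

# With regex=False the suffix is matched as a literal substring.
as_literal = prompts.str.contains(common_suffix, regex=False).tolist()

print(as_regex)    # [True, True]
print(as_literal)  # [True, False]
```

Other suffixes, such as one with an unbalanced parenthesis, would not just over-match but raise a regex compilation error, which is why a literal match is the safer choice here.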
Co-authored-by: Boris Power <[email protected]>
immediate_msg += f"\n WARNING: Some of your prompts contain the suffix `{common_suffix}` more than once. We strongly suggest that you review your prompts and add a unique suffix"
error_msg = f"All completions are identical: `{common_suffix}`\nEnsure completions are different, otherwise the model will just repeat `{common_suffix}`"
error_msg = f"All completions are identical: `{common_suffix}`\nEnsure completions are different, otherwise the model will just repeat `{common_suffix}`"
immediate_msg += f"\n WARNING: Some of your completions contain the suffix `{common_suffix}` more than once. We suggest that you review your completions and add a unique ending"
     # Add -v VALID_FILE if we split the file into train / valid
     files_string = ("s" if split else "") + " to `" + ("` and `".join(outfnames))
     valid_string = f' -v "{outfnames[1]}"' if split else ""
+    separator_reminder = (
+        ""
+        if len(common_prompt_suffix_new_line_handled) == 0
+        else f"After you’ve fine-tuned a model, remember that your prompt has to end with the indicator string `{common_prompt_suffix_new_line_handled}` for the model to start generating completions, rather than continuing with the prompt."
+    )
     sys.stdout.write(
-        f'\nWrote modified file{files_string}`\nFeel free to take a look!\n\nNow use that file when fine-tuning:\n> openai api fine_tunes.create -t "{outfnames[0]}"{valid_string}{packing_param}\n\nAfter you’ve fine-tuned a model, remember that your prompt has to end with the indicator string `{common_prompt_suffix_new_line_handled}` for the model to start generating completions, rather than continuing with the prompt.{optional_ending_string}\n'
+        f'\nWrote modified file{files_string}`\nFeel free to take a look!\n\nNow use that file when fine-tuning:\n> openai api fine_tunes.create -t "{outfnames[0]}"{valid_string}{packing_param}\n\n{separator_reminder}{optional_ending_string}\n'
     )
 else:
     sys.stdout.write("Aborting... did not write the file\n")
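The effect of the `separator_reminder` change can be sketched in isolation. This is a minimal standalone version, not the code from the repository: the helper name `build_separator_reminder` and the sample suffix are hypothetical, and only the reminder text and the empty-suffix check follow the diff.

```python
def build_separator_reminder(common_prompt_suffix: str) -> str:
    """Return the reminder only when a prompt separator actually exists."""
    # Hypothetical helper: mirrors the conditional expression in the diff.
    if len(common_prompt_suffix) == 0:
        return ""
    return (
        "After you’ve fine-tuned a model, remember that your prompt has to end "
        f"with the indicator string `{common_prompt_suffix}` for the model to "
        "start generating completions, rather than continuing with the prompt."
    )


# With no separator, nothing is added to the final message.
no_separator = build_separator_reminder("")

# With a separator (here an example value " ->"), the reminder names it.
with_separator = build_separator_reminder(" ->")
print(with_separator)
```

Before the change, the reminder was always written, so a dataset without a prompt separator produced a message quoting an empty indicator string.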