Skip to content
This repository was archived by the owner on Nov 1, 2024. It is now read-only.

Cm3 integration#727

Open
urielsinger wants to merge 25 commits into
mainfrom
cm3_seq_len
Open

Cm3 integration#727
urielsinger wants to merge 25 commits into
mainfrom
cm3_seq_len

Conversation

@urielsinger

Copy link
Copy Markdown
Contributor

No description provided.

@ArmenAg

ArmenAg commented Jun 6, 2023

Copy link
Copy Markdown
Contributor

This has some specific logic to CM3leon project (i.e., img token conversions), are we sure we want to land this in main? Do we want to possible pick a branch and merge everything into there and periodically update to main?

Comment thread metaseq/cli/train.py
process_group=distributed_utils.get_data_parallel_group(),
)
model = task.build_model(cfg.model)
if not isinstance(model, FullyShardedDataParallel):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm, this is for loading up consolidated model for training?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.
I added support to change the MP size during job lunch, and for that I need to wrap it in FullyShardedDataParallel inside the build_model.
As I don't want to double wrap it, I needed to add this if..

@@ -4,6 +4,7 @@
# LICENSE file in the root directory of this source tree.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are changes to the cm3 objectives that i landed in scaling_racm3 correct?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly.

@ArmenAg

ArmenAg commented Jun 6, 2023

Copy link
Copy Markdown
Contributor

Code LGTM. @suchenzang to give guidance on whether or not to land here.

def _create_cm3_special_tokens(self):
self.cm3_sentinel_end = "<eoss>"
self.cm3_break = "<racm3:break>"
self.dictionary.add_symbol(self.cm3_break)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all looking great.
We want to make a change here to recycle unused embedding index of cm3_break and sentinel for the next version.
Should i just add a commit on top of this PR? Or do I file a different PR?

@@ -200,24 +209,31 @@ def get_document_boundaries(self, item: torch.Tensor):
boundaries = boundaries + [item.size(0)]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is get_document_boundaries() robust to the case that there is no break tokens

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants