Add attribution metadata policy for BIG_DATA #610

Open
Yaswant Pradhan (yaswant) wants to merge 15 commits into MetOffice:main from yaswant:big-data-policy

@yaswant
Contributor

@yaswant Yaswant Pradhan (yaswant) commented Mar 31, 2026

PR Summary

IAO Approver: Ben Fitzpatrick (@benfitzpatrick)
Code Reviewer: Sam Clarke-Green (@t00sa)

Cc: Andrew Clark (@arjclark)

Adding attribution metadata policy to working practices.

Code Quality Checklist

IAO Comments

Code Review

  • The changes are coherent and valid

@t00sa
Collaborator

This is a good start, but there are no rules for developers. Without clear guidance on what sort of data can be added and what supporting evidence is required, this will result in more problems during review.

The PR also seems to be doing two things: adding information about the big data policy, and fixing typos and changing build instructions in other sections. These are separate and the second part needs to be split out into another PR.

@yaswant
Contributor Author

Good points, Sam Clarke-Green (@t00sa) - I initially thought this would be best placed in the Developer section, but since the reference to BIG_DATA was only in the Reviewer section, that's where it ended up. You're correct that some adjustments are needed. Regarding what's permitted, ANCILDIR-Deploy should be considered the single source of truth.

I'll revert the typo and jules-doc compilation updates and address them in another PR.

Comment thread source/Reviewers/howtocommit.rst Outdated
@yaswant
Contributor Author

Thanks both.

Moved the guidance to the development checklist section. Included the process (an extract from ANCILDIR-Deploy) here for wider visibility.

Collaborator

@t00sa Sam Clarke-Green (t00sa) left a comment

See comments in-line

Comment thread source/Development/testdata.rst Outdated
Comment thread source/Development/testdata.rst Outdated
If the change requires a new or updated file in ``LFRIC_DATA_DIR`` then you
will need to work with the Information Asset Owner (IAO) to ensure that data
in ``LFRIC_DATA_DIR`` must include clear attribution and licence metadata.
Where possible, this should follow existing UM ``ANCILDIR`` conventions (`see
Collaborator

Better not to imply that this is optional:

Suggested change
- Where possible, this should follow existing UM ``ANCILDIR`` conventions (`see
+ This should follow existing UM ``ANCILDIR`` conventions (`see

Contributor Author

For LFRic, we prefer to include the metadata and licence as NetCDF global attributes rather than storing them separately. This approach does not currently apply to UM ANCILDIR.

For non-NetCDF LFRic files, we follow the UM ANCILDIR convention.
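
As a minimal sketch of that split (not the project's actual tooling), the check below picks a convention from the file suffix and reports which attribution attributes are missing from a mapping of global attributes. The attribute names in `REQUIRED_ATTRS`, the suffix list, and both function names are assumptions made for illustration; the authoritative requirements come from the IAO and ANCILDIR-Deploy.

```python
from pathlib import Path

# Hypothetical attribute names: the real required set is defined by the IAO
# and ANCILDIR-Deploy, not by this sketch.
REQUIRED_ATTRS = ("source", "licence", "attribution")

# Assumed suffixes for NetCDF files.
NETCDF_SUFFIXES = {".nc", ".nc4"}


def metadata_convention(path):
    """Return which metadata convention applies, judged by file suffix."""
    if Path(path).suffix.lower() in NETCDF_SUFFIXES:
        return "netcdf-global-attributes"
    return "ancildir-convention"


def missing_attrs(global_attrs, required=REQUIRED_ATTRS):
    """List the required attributes that are absent or blank in a mapping."""
    return [
        name
        for name in required
        if not str(global_attrs.get(name, "")).strip()
    ]
```

With the netCDF4 Python library, the mapping could be built from `Dataset.ncattrs()` and `Dataset.getncattr()`; a plain dict keeps the sketch free of dependencies.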

Collaborator

On balance, it would be better to remove this entire pull-out box and put the information in the Licences section. The different standards for netCDF and non-netCDF files could then be specified separately to make each point clearer.

I think it would also be useful to remove the comment about the LFRic data directory, because this policy applies regardless of where the files are being installed.

Comment thread source/Development/testdata.rst Outdated
Comment thread source/Development/testdata.rst Outdated
Comment thread source/Development/testdata.rst Outdated
Comment thread source/Development/testdata.rst Outdated
Comment thread source/Development/testdata.rst Outdated
Comment thread source/Development/testdata.rst Outdated
Comment thread source/Development/testdata.rst Outdated
Apply CR suggestion

Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>
Apply CR suggestion

Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>
Apply CR suggestion

Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>
Apply CR suggestion

Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>
Apply CR suggestion

Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>
Apply CR suggestion

Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>
Apply CR suggestion.

Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>
@yaswant Yaswant Pradhan (yaswant) added this to the Summer 2026 milestone May 20, 2026

Looks good to me, thank you!

@yaswant Yaswant Pradhan (yaswant) changed the title from "Add attribution metadat policy for BIG_DATA" to "Add attribution metadata policy for BIG_DATA" May 29, 2026
Collaborator

@t00sa Sam Clarke-Green (t00sa) left a comment

See comments inline. Given that the licensing of data is mandatory, the language should be unequivocal, and the consequences of failing to engage with an IAO should be made clear in the introduction.

Comment thread source/Development/testdata.rst Outdated
Comment thread source/Development/testdata.rst Outdated
Comment thread source/Development/testdata.rst Outdated
Comment on lines 17 to 18
Collaborator

It would be better to describe these instructions as applying to all Met Office managed systems. That would also cover any test data which has to be copied to other platforms, e.g. JULES data on JASMIN, as well as UM AUX, Socrates spectral etc.

Contributor Author

Done.

Comment thread source/Development/testdata.rst Outdated
Comment on lines +107 to +109
If you have questions about the process or concerns about the provenance of the
data you want to include, please engage with the IAO as early as possible to
prevent delays to your change later on.
Collaborator

This should be at the start of the document, not the end.

It should also include an explicit statement that changes which depend on unlicensed data, or on data of unknown provenance, will not be approved until the problems are resolved. This will give greater clarity to both developers and reviewers.
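
One way to picture that rule, as a hedged sketch rather than an agreed process: a change is approvable only when every data file it depends on has both a licence and a known provenance recorded. The `DataFile` record, its field names, and the two helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    """Hypothetical record of a data file that a change depends on."""
    name: str
    licence: str = ""     # empty means no licence has been recorded
    provenance: str = ""  # empty means the provenance is unknown


def blocking_files(files):
    """Return the names of files lacking a licence or a known provenance."""
    return [
        f.name
        for f in files
        if not f.licence.strip() or not f.provenance.strip()
    ]


def can_approve(files):
    """A change is approvable only if no data file blocks it."""
    return not blocking_files(files)
```

Reporting the blocking file names, rather than a bare yes or no, gives the developer and the IAO a concrete list to resolve.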

Contributor Author

Done.

Yaswant Pradhan (yaswant) and others added 2 commits June 1, 2026 11:49
Apply reviewer suggestion.

Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>