explicitly set label to node id #167

Merged
merged 1 commit into from
Dec 12, 2024

Conversation

Collaborator

@whelena whelena commented Dec 2, 2024

Description

Make sure the label is set to the node id when it is missing from the tree dataframe.

Checklist

  • This PR does NOT contain Protected Health Information (PHI). A repo may need to be deleted if such data is uploaded.
    Disclosing PHI is a major problem[1]. Even a small leak can be costly[2].

  • This PR does NOT contain germline genetic data[3], RNA-Seq, DNA methylation, microbiome or other molecular data[4].

  • This PR does NOT contain other non-plain text files, such as: compressed files, images (e.g. .png, .jpeg), .pdf, .RData, .xlsx, .doc, .ppt, or other output files.

  To automatically exclude such files using a .gitignore file, see here for example.

  • I have read the code review guidelines and the code review best practice on GitHub check-list.

  • I have set up or verified the main branch protection rule following the github standards before opening this pull request.

  • The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].

  • I have added the major changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.

Footnotes

  1. UCLA Health reaches $7.5m settlement over 2015 breach of 4.5m patient records

  2. The average healthcare data breach costs $2.2 million, despite the majority of breaches releasing fewer than 500 records.

  3. Genetic information is considered PHI.
    Forensic assays can identify patients with as few as 21 SNPs

  4. RNA-Seq, DNA methylation, microbiome, or other molecular data can be used to predict genotypes (PHI) and reveal a patient's identity.

@whelena changed the title from "explicitly set label if not specified" to "explicitly set label to node id" on Dec 2, 2024
@whelena whelena requested a review from dan-knight December 2, 2024 21:51
Collaborator

@dan-knight dan-knight left a comment

Any reason why we don't want to just set the rownames for compatibility with SRCGrob?

@whelena
Collaborator Author

whelena commented Dec 11, 2024

> Any reason why we don't want to set the rownames for compatibility with SRCGrob?

I'm setting the rownames to the node.id column if it is specified, which is compatible with SRCGrob. The label can then differ from node.id, but if it's not specified it should default to node.id.

Basically, without the line I added, the label defaults to sequential numeric values even when node.id is not numeric.
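
The fallback described in this comment can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the package's R code: the list-of-dicts layout, the `node_id` key, and the function name `fill_missing_labels` are all hypothetical stand-ins for the tree dataframe and its node.id and label columns.

```python
def fill_missing_labels(nodes):
    """Return copies of the node records with 'label' defaulting to 'node_id'.

    Hypothetical helper: mirrors the idea of this PR, not its actual code.
    """
    filled = []
    for node in nodes:
        record = dict(node)  # copy so the caller's records are not modified
        # An explicit label is kept; a missing one falls back to the node id
        # rather than to the node's position in the table.
        if record.get("label") is None:
            record["label"] = str(record["node_id"])
        filled.append(record)
    return filled


# Assumed example tree with non-numeric node ids; only one node has a label.
tree = [
    {"node_id": "MRCA", "parent": None},
    {"node_id": "cloneA", "parent": "MRCA", "label": "A"},
    {"node_id": "cloneB", "parent": "MRCA"},
]

labels = [node["label"] for node in fill_missing_labels(tree)]
print(labels)
```

Nodes with an explicit label keep it, and the others are labeled by their id, so a non-numeric node.id never gets replaced by a row position.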

@dan-knight
Collaborator

Interesting. I remember this issue, but I'd have thought that SRCGrob would handle it automatically. If not, this makes sense.

@whelena
Collaborator Author

whelena commented Dec 12, 2024

Yeah, it was able to build the tree with the right edge connections, but somehow the labels were still in numerical order.

@whelena whelena merged commit 6182709 into main Dec 12, 2024
6 checks passed
@whelena whelena deleted the hwinata-fix-tree-label branch December 12, 2024 01:19