Merge pattern regex adjustment #1964

Draft
wants to merge 3 commits into
base: master

@olivroy (Collaborator) commented Mar 5, 2025

Summary

Two tests are failing, which I still need to investigate, but I am opening this in case you know how to fix them.

Basically, the regexp was wrong in this case, which caused an infinite loop. So I added a condition that errors early when no match occurs. I just need to figure out why the tests are failing and how I could simplify them.
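One way to fail fast, sketched here in base R, is to check that the `<<` and `>>` delimiters are balanced before any resolving loop runs. `check_section_delimiters()` is a hypothetical helper, not a gt function, and this up-front check is a different guard from the in-loop condition described above.

```r
# Hypothetical validator (not part of gt): stop early when the `<<`/`>>`
# delimiters in a merge pattern are unbalanced or out of order.
check_section_delimiters <- function(pattern) {
  # Collect every `<<` and `>>` token, left to right, without overlaps.
  hits <- gregexpr("<<|>>", pattern)
  tokens <- regmatches(pattern, hits)[[1]]

  # Running nesting depth: +1 for each opening, -1 for each closing.
  depth <- cumsum(ifelse(tokens == "<<", 1L, -1L))

  # Depth must never go negative and must end at zero.
  if (length(depth) > 0L &&
      (any(depth < 0L) || depth[length(depth)] != 0L)) {
    stop("Unbalanced `<<`/`>>` in pattern: ", pattern, call. = FALSE)
  }
  invisible(pattern)
}

check_section_delimiters("<<{1}<<{2}<<{3}>>>>>>")  # passes silently
```

A pattern such as "<<{1}<<{2}>>" would stop here with an error instead of reaching a loop that can never finish.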

This also needs a NEWS entry.

It seems like this creates problems for patterns with more than two <<:

dplyr::tibble(
  a = 1:4,
  b = c(1, NA, 3, NA),
  c = c(1, 2, NA, NA),
  d = c("1", "2", NA_character_, NA_character_),
  e = c(TRUE, FALSE, NA, NA)
) |>
  gt() |>
  cols_merge(columns = c(a, b, c), pattern = "<<{1}<<{2}<<{3}>>>>>>")

This should give c("111", "2", "33", "4"), but we are erasing everything as soon as there is an NA in the first column.

dplyr::tibble(
  a = 1:4,
  b = c(1, NA, 3, NA),
  c = c(1, 2, NA, NA),
  d = c("1", "2", NA_character_, NA_character_),
  e = c(TRUE, FALSE, NA, NA)
) |>
  gt() |>
  cols_merge(columns = c(a, b, c), pattern = "{1}<<{2}<<{3}>>>>")

This should also give c("111", "2", "33", "4") in the first column.

@rich-iannone (Member) commented

Oh yes, I discovered problems with pattern and nesting like this quite a while back but didn't get to writing up an issue about it. Thank you for taking this one on (while loops are dangerous).

@olivroy (Collaborator, Author) commented Mar 5, 2025

Yeah, I have attempted to fix this before, but I never got it working. I am out of ideas.

Also, how is pattern = "<<{1}<<{2}<<{3}>>>>>>" different from "<<{1}>><<{2}>><<{3}>>"? All other cases seem to render fine, but the regex I am using here attempts to exclude too many consecutive <<.

If they are the same, maybe we could attempt to normalize <<{1}<<{2}<<{3}>>>>>> to <<{1}>><<{2}>><<{3}>> with a deprecation warning?
(that seems tedious and not straightforward)

Another idea I had was to change < in pattern to "::opening_secondary::" and > to "::closing_secondary::", but I didn't try this.

@rich-iannone (Member) commented

> Also, how is pattern = "<<{1}<<{2}<<{3}>>>>>>" different from "<<{1}>><<{2}>><<{3}>>"

I believe the idea was that if {2} is missing, both {2} and {3} would be removed. Some examples:

x = 1, y = 2, z = 3

"<<{x}<<{y(NA)}<<{z}>>>>>>" -> "1"
"<<{x}<< [{y}]<<({z(NA)})>>>>>>" -> "1 [2]"
"<<{x(NA)}<<{y}<<{z}>>>>>>" -> ""

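The three examples above can be reproduced by resolving the innermost `<<...>>` section first and working outward. This is a minimal base R sketch, not gt's implementation: `resolve_pattern()` and the `__MISSING__` sentinel are hypothetical, and the named placeholders {x}, {y}, {z} stand in for gt's {1}, {2}, {3}.

```r
# Hypothetical sketch (not gt's code): fill placeholders, then drop every
# `<<...>>` section that contains a missing value, innermost first.
resolve_pattern <- function(pattern, values) {
  # Sentinel for a missing value; assumed never to appear in real data.
  na_token <- "__MISSING__"

  out <- pattern
  for (nm in names(values)) {
    val <- if (is.na(values[[nm]])) na_token else as.character(values[[nm]])
    out <- gsub(paste0("{", nm, "}"), val, out, fixed = TRUE)
  }

  # Innermost section: `<<`, text with no `<<` or `>>` inside, then `>>`.
  innermost <- "<<((?:(?!<<|>>).)*)>>"

  while (grepl("<<", out, fixed = TRUE)) {
    m <- regexpr(innermost, out, perl = TRUE)
    if (m[[1]] == -1L) {
      # Guard: without it, an unmatched `<<` would loop forever.
      stop("Unbalanced `<<`/`>>` in pattern: ", pattern, call. = FALSE)
    }
    start <- m[[1]]
    len <- attr(m, "match.length")
    inner <- substr(out, start + 2L, start + len - 3L)
    keep <- if (grepl(na_token, inner, fixed = TRUE)) "" else inner
    out <- paste0(
      substr(out, 1L, start - 1L),
      keep,
      substr(out, start + len, nchar(out))
    )
  }

  # A missing value outside any section is shown as "NA".
  gsub(na_token, "NA", out, fixed = TRUE)
}

resolve_pattern("<<{x}<<{y}<<{z}>>>>>>", c(x = "1", y = NA, z = "3"))        # "1"
resolve_pattern("<<{x}<< [{y}]<<({z})>>>>>>", c(x = "1", y = "2", z = NA))  # "1 [2]"
resolve_pattern("<<{x}<<{y}<<{z}>>>>>>", c(x = NA, y = "2", z = "3"))        # ""
```

Because each iteration either removes one complete section or stops with an error, the loop always terminates.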
A sort of weird idea to make this work is to insert an uncommon literal character at every second position in a run of ">>>>>>" (making ">>|>>|>>") and then remove that literal character as the final step (it probably shouldn't be a | though).
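The separator idea can be sketched in base R with a lookahead. `mark_closings()` and `unmark_closings()` are hypothetical helper names, and the default separator (the ASCII unit-separator control character) is only an assumption about what counts as uncommon enough; "|" is used in the example purely for readability.

```r
# Hypothetical helpers (not part of gt).

# Insert `sep` after every `>>` that is directly followed by another `>>`.
# `sep` is assumed not to contain backslashes, since it is used as a
# regex replacement string.
mark_closings <- function(pattern, sep = "\x1F") {
  gsub(">>(?=>>)", paste0(">>", sep), pattern, perl = TRUE)
}

# Final step: remove the separator again.
unmark_closings <- function(x, sep = "\x1F") {
  gsub(sep, "", x, fixed = TRUE)
}

marked <- mark_closings("<<{1}<<{2}<<{3}>>>>>>", sep = "|")
marked                              # "<<{1}<<{2}<<{3}>>|>>|>>"
unmark_closings(marked, sep = "|")  # "<<{1}<<{2}<<{3}>>>>>>"
```

With the run split into discrete ">>" tokens, a regex no longer has to decide where one closing delimiter ends and the next begins.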

Development

Successfully merging this pull request may close these issues.

When pattern manages NAs in fmt_icon() |> cols_merge(), NA gets repeatedly appended to the value.
2 participants