Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Matcher doesn't ignore child elements if other than XPath #465

Closed
beatngu13 opened this issue Oct 15, 2019 · 4 comments · Fixed by #471
Closed

Matcher doesn't ignore child elements if other than XPath #465

beatngu13 opened this issue Oct 15, 2019 · 4 comments · Fixed by #471
Labels
bug Something isn't working

Comments

@beatngu13
Copy link
Contributor

According to our docs:

Elements are identified by one special attribute $key, so called identifying attributes. If you directly match an element with no additional expressions (e.g. attributes), all of its child elements are ignored too.

# Match the element where its attribute $key fully matches the $value
matcher: $key=$value

This works fine when using an XPath, for example:

matcher: xpath=html[1]/[body[1]/div[1]

But when using a different matcher (i.e. retestid, id, class or type), child elements aren't ignored anymore. To verify this issue:

  1. Pick an arbitrary page.
  2. Create a GM.
  3. Change something (test fails).
  4. Ignore the parent of the changed element via XPath.
  5. Run again (test passes).
  6. Change the ignore to something else as mentioned above.
  7. Run again (test fails).

I assume this is because in ElementXPathMatcher we do:

@Override
public boolean test( final Element element ) {
	return element.getIdentifyingAttributes().getPath().startsWith( xpath );
}

This includes child elements because they share the same XPath prefix. However, this isn't the case for retestid, id, class or type. Here, we don't check if a parent has been ignored.

A possible solution: Let all matchers also be a ElementXPathMatcher (e.g. via inheritance). Whenever such a matcher actually matches—for instance, ElementIdMatcher finds an element with an ID that should be ignored—then we store the XPath too and incorporate this in subsequent invocations. Note that:

  1. A matcher might matches multiple elements, consequently, we would have to store and compare multiple XPaths.
  2. This approach requires ordering. Assume we want to ignore id=foo, where this element's XPath equals html[1]/[body[1]/div[1]/div[1]. I don't believe this is the case, but if e.g. we check one of the child elements before we store the element-to-be-ignored's XPath, it won't be ignored.

Other ideas?

@beatngu13 beatngu13 added the bug Something isn't working label Oct 15, 2019
@diba1013
Copy link
Contributor

With my current understanding of your solution, that would extend the Matcher to do more than just match an element; it stores a state. With such an SoC design, the individual classes should not store a state, since they are essentially a Predicate<Element> and they do not know in which context they are being called—theoretically, we could cache those matchers upon parsing a filter line and reuse them.

However, the more significant problem would be, that matchers might also be executed together with an attribute (e.g. matcher: xpath=html[1]/[body[1]/div[1], attribute: foo) which would then also match all parents.

My alternative solution would be:

The origin, where the particular matcher for a single element (without attributes) is called, is the ElementFilter. We can do the parent matching:

@Override
public boolean matches( final Element element ) {
	Element parent = element;
	do {
		if ( matcher.matches(parent) ) {
			return true;
		}
		parent = parent.getParent()
	while ( parent != null );
	return false;
}

or with Java 9+ 😢

@Override
public boolean matches( final Element element ) {
	return Stream.iterate( element, Objects::nonNull, Element::getParent ) // .parallel() ?
		.anyMatch( matcher );
}

Things to consider:

  • This computation would be—in the worst case—propagated to the root element. If we have a deep enough tree (take youtube.com or any other big site for example), this may be an expensive computation, triggering the same computation multiple times (e.g. a parent with 100 children is evaluated 100+ times). This might require parallelization and caching, as all elements will be looked at.
  • The XPathMatcher should match fully, or—for performance reasons—it might be kept as it is.

@beatngu13
Copy link
Contributor Author

@diba1013 yeah, I really don't like the idea of a matcher enclosing state too. As you said, it then depends on the context it is being called from. A test capturing this behavior would look like:

assertThat( matcher ).rejects( child );
assertThat( matcher ).accepts( parent );
assertThat( matcher ).accepts( child );

This is super confusing.

We could walk the element tree upwards as you suggested, but as you already pointed out, this can be really, really expensive.

I think there are several ways to approach this and we shouldn't rush for a decision—although the bug is quite critical—but carefully overthink the solution we want to go with.

Regarding the XPath matcher: You mean equals instead of startsWith? If yes, I tend to disagree as we really can't change this at the moment. This breaks an existing contract, so I'm favor to do this as part of the next major.

@diba1013
Copy link
Contributor

Regarding the XPath matcher: You mean equals instead of startsWith? ...

Sure that should only be changed in conjunction with a solution to this bug. As stated, even then we might even consider leaving it as is in favor of performance, since string comparism is not fast for long strings, but considerably faster than walking the element tree (as we do not the complete Path and not just the PathElement). But even the current implementation of XPath (simple, absolute String) is cheap comparing to a potential solution of retest/recheck-web#132 or the full or even a partial XPath specification.

@beatngu13
Copy link
Contributor Author

Another approach:

  • Internal behavior: A Filter only cares about changes in regards to the given Element, never children or parents. It remains stateless and this would also mean changing the ElementXPathMatcher to switch from startsWith to equals.
  • External behavior: Overall, a Filter might match parents and/or children too. To do so for, e.g., the ElementXPathMatcher, we would simply iterate over all changes twice. First iteration to collect the data we need (e.g. XPath's for elements ignored by ID), second iteration to do the actual filtering.

This means linear time O(2‌n) = O(n), where n is the number of diffs in a test report. And we could compute this in a central place: TestReportFilter, which is used by recheck, recheck.cli and soon review too.

Downsides?

beatngu13 added a commit to beatngu13/recheck that referenced this issue Oct 20, 2019
beatngu13 added a commit to beatngu13/recheck that referenced this issue Oct 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

Successfully merging a pull request may close this issue.

2 participants