Matcher doesn't ignore child elements if other than XPath #465

beatngu13 · 2019-10-15T16:05:13Z

According to our docs:

Elements are identified by one special attribute $key, so called identifying attributes. If you directly match an element with no additional expressions (e.g. attributes), all of its child elements are ignored too.
# Match the element where its attribute $key fully matches the $value
matcher: $key=$value

This works fine when using an XPath, for example:

matcher: xpath=html[1]/[body[1]/div[1]

But when using a different matcher (i.e. retestid, id, class or type), child elements aren't ignored anymore. To verify this issue:

Pick an arbitrary page.
Create a GM.
Change something (test fails).
Ignore the parent of the changed element via XPath.
Run again (test passes).
Change the ignore to something else as mentioned above.
Run again (test fails).

I assume this is because in ElementXPathMatcher we do:

@Override
public boolean test( final Element element ) {
	return element.getIdentifyingAttributes().getPath().startsWith( xpath );
}

This includes child elements because they share the same XPath prefix. However, this isn't the case for retestid, id, class or type. Here, we don't check if a parent has been ignored.

A possible solution: Let all matchers also be a ElementXPathMatcher (e.g. via inheritance). Whenever such a matcher actually matches—for instance, ElementIdMatcher finds an element with an ID that should be ignored—then we store the XPath too and incorporate this in subsequent invocations. Note that:

A matcher might matches multiple elements, consequently, we would have to store and compare multiple XPaths.
This approach requires ordering. Assume we want to ignore id=foo, where this element's XPath equals html[1]/[body[1]/div[1]/div[1]. I don't believe this is the case, but if e.g. we check one of the child elements before we store the element-to-be-ignored's XPath, it won't be ignored.

Other ideas?

The text was updated successfully, but these errors were encountered:

diba1013 · 2019-10-17T13:06:58Z

With my current understanding of your solution, that would extend the Matcher to do more than just match an element; it stores a state. With such an SoC design, the individual classes should not store a state, since they are essentially a Predicate<Element> and they do not know in which context they are being called—theoretically, we could cache those matchers upon parsing a filter line and reuse them.

However, the more significant problem would be, that matchers might also be executed together with an attribute (e.g. matcher: xpath=html[1]/[body[1]/div[1], attribute: foo) which would then also match all parents.

My alternative solution would be:

The origin, where the particular matcher for a single element (without attributes) is called, is the ElementFilter. We can do the parent matching:

@Override
public boolean matches( final Element element ) {
	Element parent = element;
	do {
		if ( matcher.matches(parent) ) {
			return true;
		}
		parent = parent.getParent()
	while ( parent != null );
	return false;
}

or with Java 9+ 😢

@Override
public boolean matches( final Element element ) {
	return Stream.iterate( element, Objects::nonNull, Element::getParent ) // .parallel() ?
		.anyMatch( matcher );
}

Things to consider:

This computation would be—in the worst case—propagated to the root element. If we have a deep enough tree (take youtube.com or any other big site for example), this may be an expensive computation, triggering the same computation multiple times (e.g. a parent with 100 children is evaluated 100+ times). This might require parallelization and caching, as all elements will be looked at.
The XPathMatcher should match fully, or—for performance reasons—it might be kept as it is.

beatngu13 · 2019-10-17T13:20:22Z

@diba1013 yeah, I really don't like the idea of a matcher enclosing state too. As you said, it then depends on the context it is being called from. A test capturing this behavior would look like:

assertThat( matcher ).rejects( child );
assertThat( matcher ).accepts( parent );
assertThat( matcher ).accepts( child );

This is super confusing.

We could walk the element tree upwards as you suggested, but as you already pointed out, this can be really, really expensive.

I think there are several ways to approach this and we shouldn't rush for a decision—although the bug is quite critical—but carefully overthink the solution we want to go with.

Regarding the XPath matcher: You mean equals instead of startsWith? If yes, I tend to disagree as we really can't change this at the moment. This breaks an existing contract, so I'm favor to do this as part of the next major.

diba1013 · 2019-10-17T13:49:04Z

Regarding the XPath matcher: You mean equals instead of startsWith? ...

Sure that should only be changed in conjunction with a solution to this bug. As stated, even then we might even consider leaving it as is in favor of performance, since string comparism is not fast for long strings, but considerably faster than walking the element tree (as we do not the complete Path and not just the PathElement). But even the current implementation of XPath (simple, absolute String) is cheap comparing to a potential solution of retest/recheck-web#132 or the full or even a partial XPath specification.

beatngu13 · 2019-10-18T06:12:31Z

Another approach:

Internal behavior: A Filter only cares about changes in regards to the given Element, never children or parents. It remains stateless and this would also mean changing the ElementXPathMatcher to switch from startsWith to equals.
External behavior: Overall, a Filter might match parents and/or children too. To do so for, e.g., the ElementXPathMatcher, we would simply iterate over all changes twice. First iteration to collect the data we need (e.g. XPath's for elements ignored by ID), second iteration to do the actual filtering.

This means linear time O(2‌n) = O(n), where n is the number of diffs in a test report. And we could compute this in a central place: TestReportFilter, which is used by recheck, recheck.cli and soon review too.

Downsides?

See: retest#465

beatngu13 added the bug Something isn't working label Oct 15, 2019

beatngu13 added a commit to beatngu13/recheck that referenced this issue Oct 20, 2019

Test parents too

dcf01d0

See: retest#465

beatngu13 added a commit to beatngu13/recheck that referenced this issue Oct 20, 2019

Test parents too

c373006

See: retest#465

Creepow mentioned this issue Oct 21, 2019

Feature/matcher consider parents #471

Merged

beatngu13 closed this as completed in #471 Oct 22, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Matcher doesn't ignore child elements if other than XPath #465

Matcher doesn't ignore child elements if other than XPath #465

beatngu13 commented Oct 15, 2019

diba1013 commented Oct 17, 2019

beatngu13 commented Oct 17, 2019

diba1013 commented Oct 17, 2019

beatngu13 commented Oct 18, 2019

Matcher doesn't ignore child elements if other than XPath #465

Matcher doesn't ignore child elements if other than XPath #465

Comments

beatngu13 commented Oct 15, 2019

diba1013 commented Oct 17, 2019

beatngu13 commented Oct 17, 2019

diba1013 commented Oct 17, 2019

beatngu13 commented Oct 18, 2019