WAF harvesting can fail to parse many listings that are de facto WAFs, such as this one: https://gcoos4.tamu.edu/erddap/metadata/iso19115/xml/
Because the harvester looks explicitly for the literal string "a href", any anchor tag that does not follow that exact ordering will fail to harvest. Is there a reason links are found with a general-purpose text-parsing library, which has known pitfalls when applied to XML, rather than with a proper XML parsing library?
Also, for the link above, the "apache" parser is selected because of the "Server" header, even though this is clearly not an Apache directory listing but a reverse-proxied application. This was difficult to track down when I had to write custom logic for the "other" parser to work around the WAF parser shortcomings described above.
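
To illustrate the link-extraction question, here is a minimal sketch of collecting links with a real markup parser, so that attribute order, case, and quoting style do not matter. It is not the harvester's actual code: `LinkCollector`, `extract_xml_links`, and the `example.org` URL are hypothetical names used only for illustration, and it assumes the listing is HTML that Python's standard-library `html.parser` can tokenize.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...>
        # and <a class="x" href=...> are both handled here.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


def extract_xml_links(base_url, html):
    """Return absolute URLs of links in `html` whose target ends in .xml."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    absolute = [urljoin(base_url, href) for href in parser.hrefs]
    return [url for url in absolute if url.lower().endswith(".xml")]


# Assumed sample listing: attributes appear before href, and case varies.
sample = """
<table>
  <tr><td><A class="file" HREF='record1.xml'>record1.xml</A></td></tr>
  <tr><td><a title="dir" href="sub/">sub/</a></td></tr>
  <tr><td><a href="?C=N;O=D">Name</a></td></tr>
</table>
"""
links = extract_xml_links("https://example.org/waf/", sample)
print(links)
```

With a parser like this, an anchor such as `<a class="file" href="...">` is picked up even though the string "a href" never appears contiguously in the markup.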