Wishlist: date support #34
Comments
I'll have to refresh my memory on this, but I think there's some reason why range limits end up being a little tricky (maybe something about how to seed the initial start / end dates?), but 👍 if you can make it work. |
Yeah, an ideal implementation would have cached up the min and max values for all sortable fields to provide concrete start/end values (also caching all top level facets, while we're at it). This would occur at server spin-up (de facto index-warming query). Barring that, configurable endpoints would satisfy (e.g., for most public libraries, |
I also can't remember the details of why the Solr range faceting (the start/end/gap thing isn't limited to 'dates', right?) wasn't going to work right, but I also remember there was some issue. Some things:
(Of course, another option would be submitting a patch to Solr itself to use the human-friendly ranges. It seems like something Solr ought to do. The code here that creates human-friendly ranges was originally ported from Flot, which has an MIT-style license.) Personally, I would want the existing functionality: both ranges calculated within the current result set, not just globally, and human-friendly ranges. Which is why the plugin is currently written how it is. Also, yet another issue: you talk about Solr date fields specifically, but the current code actually works on int-like values, not dates. We use it with years, but put the years in a Solr int field. It could be enhanced to work with actual date fields, but it's a little bit tricky to plot them properly with Flot (the JavaScript graph-drawing library the plugin uses), and then translate back in the other direction when applying limits from the Flot graph. Again, I think this is a third orthogonal concern, independent of the other two. The Solr auto-segmented range queries work on date as well as int, so you could use them on an int-like field too. |
Come to think of it, the first bullet point could also be a patch to Solr: use the min/max within the given result set, instead of making the client give you an explicit min/max for auto-segmented ranges. It would still require two Lucene queries under the hood, but with a patch to Solr it could be one Solr query. I think patches to Solr would be the best way to handle both bullet points, but I'm not up to the Java myself. Alternatively, I think a PR to this code would be possible if you really want the behavior you mention, but it should probably be a configurable alternative to the existing behavior rather than a replacement for it; I prefer the existing behavior. |
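To make the "two queries" point concrete, here is a rough sketch of the first step: asking Solr's StatsComponent for the min/max of a field within the current result set. This is not code from the plugin. The field name pub_year and the sample response are made up for illustration, and the response layout assumed here is the usual JSON shape for stats results.

```python
def stats_params(q, field):
    """Params for a first query that only fetches min/max for `field`
    within the results of `q` (no documents returned)."""
    return [
        ("q", q),
        ("rows", "0"),
        ("stats", "true"),
        ("stats.field", field),
    ]


def min_max_from_stats(response, field):
    """Pull min/max out of a parsed JSON response from the query above."""
    field_stats = response["stats"]["stats_fields"][field]
    return field_stats["min"], field_stats["max"]


# Hypothetical field name and made-up response, for illustration only.
params = stats_params("title:whales", "pub_year")
sample_response = {
    "stats": {"stats_fields": {"pub_year": {"min": 1903, "max": 1950}}}
}
lo, hi = min_max_from_stats(sample_response, "pub_year")
```

The second query would then use lo and hi to define the segments, however those end up being requested.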
@jrochkind, I don't think you get how much of what you are talking about is already in Solr. The contrast is between "range faceting" as implemented here by separate facet queries (
To your first point, nothing about this stops working when you are "within" a query. Facets that are not query-specific sorta miss the point. At this point, I only care about dates. Assuming we can suppress leading and trailing 0-valued facet counts, code-level default range of 0000-01-01 to Solr's (int) range faceting now descends from the early implementation of date faceting. For details, check this solr.pl post from 2010. |
Thanks Joe. I don't understand how you'd avoid two Solr queries if you want the segments and display to be within the min/max of the current search results: you have to get the min/max from the first query, then make another one with that min/max set for facet segments one way or another. Unless there's a feature I don't know about? I think the current human-friendly segments implemented client-side go past what Solr can auto-segment. The current code tries to divide into roughly N segments whose widths are 2, 5, or 10 (or multiples of those factors), and which end on even 0 or 5 or multiple-of-2 boundaries. If the range of the current search results is 12 years, the current code might divide into segments of 2 years, or for wider ranges 5 years, or 20 years, etc. Which I think is really nice, but maybe what Solr can do is good enough and it's not necessary? At any rate, a PR is of course welcome. My interest is just in maintaining the current behavior of 1) dividing into segments within the min/max of the actual search results, not the entire index, and 2) making those segments have human-friendly widths/boundaries, not just mathematical range/N. If you can do that all Solr-side, that would be sweet. If not, but there are times when someone might prefer to trade off 'ideal' UI for Solr efficiency, then I'd suggest the current behavior remain as a configurable option, probably on a per-field basis. |
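For readers who haven't looked at the segment logic, the idea of "human-friendly" widths can be sketched roughly as below. This is an illustration of the general technique (pick a width of 1, 2, 5 or 10 times a power of ten, then align boundaries to multiples of that width), not the plugin's actual code or Flot's; the function names and the target segment count are assumptions.

```python
import math


def nice_step(lo, hi, target_segments=10):
    """Smallest width of the form 1, 2, 5 or 10 times a power of ten
    that yields at most roughly `target_segments` segments (minimum 1)."""
    raw = max(hi - lo, 1) / target_segments
    if raw <= 1:
        return 1
    magnitude = 10 ** math.floor(math.log10(raw))
    for factor in (1, 2, 5, 10):
        if factor * magnitude >= raw:
            return int(factor * magnitude)
    return int(10 * magnitude)


def nice_segments(lo, hi, target_segments=10):
    """Inclusive (start, end) integer segments covering lo..hi, with
    interior boundaries aligned to multiples of the chosen width."""
    step = nice_step(lo, hi, target_segments)
    start = (lo // step) * step  # align down to a multiple of step
    segments = []
    while start <= hi:
        # Clip the first and last segment to the actual min/max.
        segments.append((max(start, lo), min(start + step - 1, hi)))
        start += step
    return segments
```

With a 12-year span such as 1990 to 2002 and a target of 6 segments, this picks a width of 2 years; with 1800 to 2000 and a target of 10, it picks 20 years. Plain range/N would instead give widths like 2.17 or 18.2.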
Yes, I think start/end/gap delivers what you want in a way that is different from how this implementation currently works. The container width logic need not be materially different. If you want boundaries of 2, 5 or 20 years, you can specify that per query. The min/max can come from four places:
|
Solr date range faceting (start/end/gap) is more performant than trying to manually build facet query ranges and allows for the simplest possible way to drill down. It seems like a great match for the interface you have here.
Example query:
Partial response:
The example shows both styles. You can see why the facet.query enumeration would get tedious. Perhaps we can use this issue to identify what the existing impediments to implementation are?
Note: this is all without getting into the new (5.x) Solr DateRangeField.
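As a rough illustration of the two styles contrasted above (this is not the original example query from this comment), the sketch below builds the request parameters for each. The facet.range start/end/gap parameters and date-math gaps such as +10YEARS are standard Solr; the field name pub_date is hypothetical, and the mixed [ ... } bracket syntax in the facet.query version assumes Solr 4.0 or later.

```python
from urllib.parse import urlencode


def range_facet_params(field, start, end, gap):
    """One facet.range request: Solr computes the buckets itself."""
    return [
        ("facet", "true"),
        ("facet.range", field),
        (f"f.{field}.facet.range.start", start),
        (f"f.{field}.facet.range.end", end),
        (f"f.{field}.facet.range.gap", gap),
    ]


def facet_query_params(field, first_year, last_year, step):
    """The same buckets spelled out by hand, one facet.query each."""
    params = [("facet", "true")]
    for year in range(first_year, last_year, step):
        lo = f"{year:04d}-01-01T00:00:00Z"
        hi = f"{min(year + step, last_year):04d}-01-01T00:00:00Z"
        # Inclusive lower bound, exclusive upper bound (assumes Solr 4.0+).
        params.append(("facet.query", f"{field}:[{lo} TO {hi}}}"))
    return params


# "pub_date" is a hypothetical date field used only for illustration.
by_range = range_facet_params(
    "pub_date", "1900-01-01T00:00:00Z", "2000-01-01T00:00:00Z", "+10YEARS"
)
by_query = facet_query_params("pub_date", 1900, 2000, 10)
query_string = urlencode(by_range)
```

The facet.range version stays at five parameters no matter how many buckets there are, while the facet.query version grows by one parameter per bucket.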