Recent comments

πŸ‡ΊπŸ‡ΈUnited States diegopino

This also happens on field_ui_table (View Modes). I have the feeling it is related to this still-open issue:

https://github.com/jsonrainbow/json-schema/issues/407

My workaround was to inline the component's Twig (its content) directly in field-ui-table.html.twig instead of embedding the component. I wonder whether, for good reasons, components should not be treated as the only solution for everything, at least until all the quirks at the core level are solved.

πŸ‡ΊπŸ‡ΈUnited States diegopino

I will make a fork/patch this week. I want to better understand Drupal's date/time class limits so I can write a test (and I will provide one once I do).

We work with cultural heritage (so our dates can span thousands, if not millions, of years into the past when dealing with archaeological/paleontological artifacts). For that we use custom fields/proper Drupal typed-data sub-properties based on https://github.com/ProfessionalWiki/EDTF parsing, so not the normal Drupal UI input values you might find on standard Drupal websites.

The values are valid 64-bit PHP dates, though. Just to be clear, I am not asking for support for those dates (that would require a complete overhaul of Drupal's time logic), just for them to be handled the way other dates are handled.

Thanks a lot for your feedback and time to review my issue

πŸ‡ΊπŸ‡ΈUnited States diegopino

Thanks for your reply @mkalkbrenner.

I am not questioning the exception thrown at the backend/index level, at least as a last resort, given that for an out-of-range date/time range we are currently sending an empty "[ TO ]" to Solr, which is wrong. From the perspective of a user indexing data, though, it is not something an end user can act on to correct the issue (or to conform to Drupal's date/time limitations, which are not Solr limitations for date-range values). It is also not "silent": the date/time parser already logs that the date could not be parsed.

But I think my point is valid, because the code already implements what I believe is the right behavior (and my proposed solution) for plain dates here:

https://git.drupalcode.org/project/search_api_solr/-/blob/4.x/src/Plugin...

          case 'date':
            $value = $this->formatDate($value);
            if ($value === FALSE) {
              continue 2;
            }
            break;

So it would not hurt to do the same for date ranges:


          case 'solr_date_range':
            $start = $this->formatDate($value->getStart());
            $end = $this->formatDate($value->getEnd());
            if ($start === FALSE || $end === FALSE) {
              // Skip the whole range, as is already done for plain dates.
              continue 2;
            }
            $value = '[' . $start . ' TO ' . $end . ']';
            break;

Thanks.

πŸ‡ΊπŸ‡ΈUnited States diegopino

Hi Folks,

I agree that exceptions are more manageable (when actually handled), and making tests happy (especially when testing a specific method) is great. But this really changes the behavior from the user's perspective: it now throws an uncaught exception when rendering inline templates, or templates via render elements that allow them, without giving the end user enough information, and thus no chance to act on an invalid array passed as a value.

With the previous user_error() behavior, when \Drupal\Core\Render\Renderer::doRender() called

// Get the children of the element, sorted by weight.
$children = Element::children($elements, TRUE); // Line 462 of Renderer.php

we would just get a bunch of errors in the output that a user/admin could act on, while everything else still rendered on screen.

Now, because the exception is not caught, one basically gets a white page. From that perspective, what is your suggestion? Adding a try/catch around this call in Renderer, and reproducing the previous (or similar) behavior in the caller (or callers)? Or is a failure to render the new expected behavior?
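To make the question concrete, the kind of guard I have in mind would look roughly like this. This is a sketch only, not a tested patch: the exception class and the logging channel are my assumptions (adjust to whatever the new code actually throws):

```php
// Sketch of a possible guard around the call in
// \Drupal\Core\Render\Renderer::doRender(), restoring something like the
// previous non-fatal behavior. Exception class and channel are assumptions.
try {
  // Get the children of the element, sorted by weight.
  $children = Element::children($elements, TRUE);
}
catch (\InvalidArgumentException $e) {
  // Surface the invalid children as an actionable error for users/admins,
  // but keep rendering everything else instead of white-screening.
  \Drupal::logger('render')->error($e->getMessage());
  $children = [];
}
```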

For good reasons, I agree with one of the comments shared here just before the change was committed:

"Actually given that this isn't really an error, we can just ignore and carry on, then I think an assertion is more appropriate?"

Any suggestions would be appreciated. Thanks a lot

πŸ‡ΊπŸ‡ΈUnited States diegopino

This bug is still around (6 years!) in 11.x, but both patches are failing because the tests no longer pass once the "excluded" elements are no longer part of the raw data and are thus removed from the URLs.

e.g. here:
https://git.drupalcode.org/project/drupal/-/blob/11.x/core/modules/views...

and here at
ExposedFormRenderTest::testExposedFormRawInput

The Layout Builder failure might just be Layout Builder itself failing in 10.1, so everything needs to be rebased to 11.x-dev.
@quietone, since you are already on this, do you want to tackle that, or are you OK with me giving it a shot?

πŸ‡ΊπŸ‡ΈUnited States diegopino

@mkalkbrenner thanks for your quick reply.

Our way of producing vectors (embedding extraction) is for sure not the standard way. We have a chainable and configurable post-processor plugin system for our custom type of fields/data that runs as a set of "extractors", from OCR to file transforms to vectors; in this case these are pushed into a background processing queue, then injected into custom datasources. The number of moving parts is kind of huge, and it does not feel like the type of project you would want to mimic for this.

But, going back to the idea of plugins: I believe people (users and devs) using your module would be more comfortable with the existing Search API processor idea. Since indexing already happens (most of the time, at least) via cron or via drush, the overhead of calling an external service (well, in our case it is external to Drupal, but not external in the sense of a commercial API) would not be huge. I mean, we enqueue and have workers for everything, but that is a choice. Why an extra plugin on top of, or in addition to, just a new processor?

Because you want to reuse the "processing/remote API call -> return as vector" logic at query time too: a Views filter needs to be able to call the same logic that was used to index a given vector, using the same API. Vectors are opinionated; a vector generated by X won't make any sense in relation to one generated by Y. Vector dimension is also key here: fixed, never variable. Lastly, depending on the comparison algorithm, you might want to provide a normalized unit vector so you can use the faster dot_product instead of cosine similarity (which, again, is a fixed setting for the field type).
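A quick sketch of that last point in plain PHP (no Drupal APIs; the function name is made up for illustration). If indexed vectors are normalized to unit length, Solr's faster dot_product similarity ranks identically to cosine similarity:

```php
<?php

/**
 * Normalize an embedding to a unit vector (L2 norm = 1).
 *
 * @param float[] $vector
 *   The raw embedding.
 *
 * @return float[]
 *   The normalized vector (returned unchanged if the norm is 0).
 */
function normalize_vector(array $vector): array {
  $norm = sqrt(array_sum(array_map(fn (float $v): float => $v * $v, $vector)));
  if ($norm == 0.0) {
    return $vector;
  }
  return array_map(fn (float $v): float => $v / $norm, $vector);
}

$unit = normalize_vector([3.0, 4.0]);
// $unit is [0.6, 0.8], whose L2 norm is 1.0.
```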

So, to sum up (my 25 cents): a post-processor (e.g. like the aggregated-field one, or the entity-renderer one) that takes as an argument another, very opinionated type of plugin as config. These plugins would have standard methods (but opinionated internal logic) to call APIs with an input (in this case the same input a normal processor would get) and return a vector (an array), plus fixed annotations with the vector size, etc. That way devs can write their own plugins that talk to/understand/provide the logic needed (which will vary a LOT for each remote service), and plug the same logic (which needs to be available outside of the processor itself) in at query time to transform the input into a vector.
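To make that shape concrete, a hypothetical contract (all names invented here; nothing like this exists in search_api today) could look like:

```php
<?php

/**
 * Hypothetical contract for a vector-provider plugin (names invented).
 *
 * A processor would hold one of these as config; a Views filter would call
 * the same plugin at query time, so index-time and query-time vectors come
 * from the same remote API with the same settings.
 */
interface VectorProviderInterface {

  /**
   * The fixed dimension of vectors this provider returns (e.g. 384).
   */
  public function getDimension(): int;

  /**
   * Whether returned vectors are unit vectors (safe for dot_product).
   */
  public function isNormalized(): bool;

  /**
   * Calls the remote/local service and returns the embedding for $input.
   *
   * @return float[]
   *   A vector of exactly getDimension() floats.
   */
  public function vectorize(string $input): array;

}
```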

I see what you are doing on search_api_clir and it is very interesting.

πŸ‡ΊπŸ‡ΈUnited States diegopino

Hi @mkalkbrenner, our project has a need for this and I'm willing to give it a try, but to align with your roadmap I need some pointers.
First, a bit of background. We already have tons of external (Drupal and non-Drupal) supporting code and some good experience altering/acting on events in this wonderful module to use custom Solr types, custom datasources, JOINs, etc., e.g. the way we alter highlighting to allow us to use fields driven by external Solr plugins that require different query arguments.

1. So, from the perspective of the actual implementation: first we need to put the data in :)
Because the dense vector types are preset with a fixed comparison algorithm and a fixed vector size per type, we are right now defining 4 types with vector sizes of 384 (BERT/text embeddings), 512 (Apple Vision image fingerprint), 576 (YOLO embeddings) and 1024 (MobileNet embeddings). I believe that, as part of a release, a 384 one should be sufficient, and anyone else could then extend this by providing their own.

The first issue is the mismatch between cardinality and field generation. A vector, when passed from PHP to Solr, is an array (so multi-valued, with a fixed size based on the field type config), but it always goes into a single-valued field in Solr (multiValued=FALSE), and the dynamic field generation in \Drupal\search_api_solr\Entity\SolrFieldType::getDynamicFields is blind to this need.
The question is (or what would you suggest):
- Add a new field type config setting (e.g., like $this->custom_code, a $this->cardinality) allowing a field type to "ask" for no dynamic fields outside of what its type allows (in the case of a vector, of course, single-valued only). This could be useful for future types/other fields driven by custom Solr plugins with the same need. It could also be a full Solr field settings override, where a field type could "ask" to control completely, via config, how the field is generated.
- OR a fixed method like getSpellcheckField() (e.g. a getDenseVectorField() that targets dense vectors specifically).
- OR an event that allows any external module to alter the dynamic fields (delegating the actual support and extra configs to anyone willing to write an event subscriber).
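For the third option, the subscriber side could look roughly like this. Everything here is invented for illustration: the event name, the event class, and its getter/setter do not exist in search_api_solr today:

```php
<?php

use Symfony\Component\EventDispatcher\EventSubscriberInterface;

/**
 * Hypothetical subscriber for the event-based option (all names invented;
 * no such event exists in search_api_solr today).
 */
class DenseVectorDynamicFieldsSubscriber implements EventSubscriberInterface {

  public static function getSubscribedEvents(): array {
    // Invented event name, for illustration only.
    return ['search_api_solr.dynamic_fields_alter' => 'alterDynamicFields'];
  }

  public function alterDynamicFields($event): void {
    $fields = $event->getDynamicFields();
    foreach ($fields as &$field) {
      // Force single-valued dynamic fields for dense vector types, since a
      // dense vector is one value even though PHP hands it over as an array.
      if (str_starts_with($field['type'] ?? '', 'knn_vector')) {
        $field['multiValued'] = FALSE;
      }
    }
    $event->setDynamicFields($fields);
  }

}
```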

Second issue: let's say we now have a dynamic, single-valued field for one of these custom field types, and I want to set a value for the field.
The data type at the PHP level will be an array (multi-valued), mismatching the data type at the backend. So the question is:
- Do we need a new @SearchApiDataType that allows a vector? Are there any other workarounds?
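In case it helps the discussion, a skeleton of what I imagine such a data type plugin would look like. The plugin ID, class, and annotation keys here are from memory and should be verified against search_api's SearchApiDataType annotation before relying on any of it:

```php
<?php

namespace Drupal\my_module\Plugin\search_api\data_type;

use Drupal\search_api\DataType\DataTypePluginBase;

/**
 * Sketch of a dense vector data type (annotation keys from memory; verify
 * against search_api's SearchApiDataType annotation class).
 *
 * @SearchApiDataType(
 *   id = "solr_dense_vector",
 *   label = @Translation("Dense vector"),
 *   description = @Translation("A fixed-size array of floats."),
 *   fallback_type = "string",
 * )
 */
class DenseVectorDataType extends DataTypePluginBase {

  /**
   * {@inheritdoc}
   */
  public function getValue($value) {
    // Keep the whole float array as one value, rather than letting it be
    // treated as multiple scalar values.
    return array_map('floatval', (array) $value);
  }

}
```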

I think how one generates/populates the vectors, both at index time and query time, is beyond a first implementation in this module. We, for example, have a Docker container that processes images and generates a custom datasource populated with this data (and NLP, hOCR). But that will vary a lot between users. Some might want to add this type of field via a processor.

At query time:
Our hack for custom queries has been to set edismax dynamically via a custom Views filter and add a custom option to the query; edismax because it is the parser that alters the query the least/is the least opinionated of all of them. Then we intercept everything in a PostConvertedQueryEvent subscriber, check whether that option was passed, and if so remove the edismax component from the Solarium query and add all our custom logic. This has allowed us in the past to do subqueries, JOINs, etc. But for an official implementation, I wonder if a custom parse plugin would be ideal. The only issue I see with that (and Views integration) is that it would have to interact with normal filters/facets but use them as a pre-filter in a !knn query. Solr also recommends three different options: pre-filtering, re-ranking, and a "must" compound query. And such a custom parser makes no sense when used in an exposed filter in a Views. Ideas?
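For reference, the query-time shape I mean, built as a plain string (Solr 9 {!knn} local-params syntax; the field name here is made up). Normal filters/facets would then go into fq, which Solr applies as a pre-filter to the knn search by default:

```php
<?php

/**
 * Build a Solr {!knn} main query string (Solr 9+ syntax).
 *
 * @param string $field
 *   The dense vector field name (made up for this example).
 * @param float[] $vector
 *   The query vector, same dimension and normalization as the indexed ones.
 * @param int $topK
 *   How many nearest neighbors to retrieve.
 */
function build_knn_query(string $field, array $vector, int $topK = 10): string {
  return sprintf('{!knn f=%s topK=%d}[%s]', $field, $topK, implode(', ', $vector));
}

$q = build_knn_query('vector_384', [0.12, 0.43, 0.77], 10);
// $q is '{!knn f=vector_384 topK=10}[0.12, 0.43, 0.77]'
```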

That is what I have so far. I think the issue is not really coding this (testing might be a challenge, but your current tests are excellent; most of what I have learned from this module came from reading your tests) but knowing what is worth tapping into, and to what degree this module needs to cover everything, or just allow the flexibility to override some things and provide the basics.

Thanks

πŸ‡ΊπŸ‡ΈUnited States diegopino

Hi @Chi, you are totally right. Please feel free to close this issue. This was my mistake: I had another root composer dependency blocking the upgrade, and the composer messages confused me. ^10.0 is correct in the sense that 10.1 and 10.2 are allowed. But really, semantically speaking, 10.1 has breaking changes compared to 10.0, so in that sense (unrelated to this module) it is not. Again, sorry, and thanks.
