Problem/Motivation
Use case: we need to store HTML in a field, and then search, filter, and sort on that same field while ignoring the HTML markup.
This can be achieved in Elasticsearch by defining a custom analyzer that uses the html_strip character filter, and applying it to a sub-field of the mapping, for example:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "char_filter": ["html_strip"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          },
          "plain_text": {
            "type": "text",
            "analyzer": "my_analyzer"
          }
        }
      }
    }
  }
}
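To check what the analyzer above produces for a given HTML string, Elasticsearch's _analyze API can be called against an index created with these settings. The following is a minimal Python sketch that only builds the request. The index name my_index, the sample text, and the helper build_analyze_request are assumptions made for illustration and are not part of this issue or the module.

```python
import json


def build_analyze_request(index, analyzer, text):
    """Return the (path, JSON body) pair for an Elasticsearch _analyze call.

    Hypothetical helper: it only assembles the request and does not send it.
    """
    path = f"/{index}/_analyze"
    body = json.dumps({"analyzer": analyzer, "text": text})
    return path, body


# "my_index" is an assumed index name; "my_analyzer" comes from the settings above.
path, body = build_analyze_request(
    "my_index", "my_analyzer", "<p>Hello <b>World</b></p>"
)
print("POST", path)
print(body)
```

Sending this request to a cluster should return a single token per input, since the keyword tokenizer does not split the text, with the markup removed by html_strip.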
Steps to reproduce
N/A
Proposed resolution
Allow the analysis configuration to be provided in a pipeline YAML file:
my_pipeline:
  label: 'My Pipeline'
  class: '\Drupal\data_pipelines_elasticsearch\ElasticSearchDatasetPipeline'
  analysis:
    analyzer:
      my_analyzer:
        tokenizer: keyword
        char_filter:
          - html_strip
  mappings:
    properties:
      name:
        type: text
        fields:
          keyword:
            type: keyword
          plain_text:
            type: text
            analyzer: my_analyzer
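One way to read the proposal is that the analysis key of the pipeline definition ends up under settings.analysis in the index-creation body, alongside mappings, so that the result matches the JSON in the Problem/Motivation section. The sketch below illustrates that translation in Python purely for clarity. The module itself is PHP, and build_index_body is a hypothetical name, not an existing function in data_pipelines_elasticsearch. The dictionary stands in for the parsed YAML, assuming analysis and mappings are direct children of the pipeline definition.

```python
def build_index_body(pipeline):
    """Translate a parsed pipeline definition into an index-creation body.

    Hypothetical helper: only the 'analysis' and 'mappings' keys are copied,
    and 'analysis' is nested under 'settings' as Elasticsearch expects.
    """
    body = {}
    if "analysis" in pipeline:
        body["settings"] = {"analysis": pipeline["analysis"]}
    if "mappings" in pipeline:
        body["mappings"] = pipeline["mappings"]
    return body


# Stand-in for the parsed my_pipeline definition from the YAML above.
pipeline = {
    "label": "My Pipeline",
    "analysis": {
        "analyzer": {
            "my_analyzer": {
                "tokenizer": "keyword",
                "char_filter": ["html_strip"],
            }
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "fields": {
                    "keyword": {"type": "keyword"},
                    "plain_text": {"type": "text", "analyzer": "my_analyzer"},
                },
            }
        }
    },
}

index_body = build_index_body(pipeline)
```

Pipeline definitions without an analysis key would produce a body with no settings entry, so existing pipelines would be unaffected under this reading.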
Remaining tasks
Test coverage, reviews, etc.
User interface changes
API changes
Data model changes