Update es_attachment pipeline to remove binary data correctly

Created on 25 April 2024, 7 months ago
Updated 26 April 2024, 7 months ago

Problem/Motivation

At the moment, "es_attachment.data" is listed in the "excludes" option of the index configuration. This works when all attachments are sent in a single request, but it breaks when attachments are added one by one, which usually happens with large files that cannot be sent at once.
Because of this, the es_attachment pipeline should be built something like this:

PUT _ingest/pipeline/es_attachment
{
  "description" : "Extract attachment content",
  "processors": [
    {
      "foreach" : {
        "field" : "es_attachment",
        "ignore_failure" : true,
        "ignore_missing" : true,
        "processor" : {
          "attachment" : {
            "ignore_failure" : true,
            "ignore_missing" : true,
            "target_field" : "_ingest._value.attachment",
            "field" : "_ingest._value.data"
          }
        }
      }
    },
    {
      "foreach" : {
        "field" : "es_attachment",
        "ignore_failure" : true,
        "ignore_missing" : true,
        "processor" : {
          "remove": {
            "ignore_failure" : true,
            "ignore_missing" : true,
            "field": "_ingest._value.data"
          }
        }
      }
    }
  ]
}
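
One way to check this behaviour before relying on it is the ingest simulate API. The request below is a minimal sketch, assuming the pipeline above has already been stored under the name es_attachment. The sample document and its base64 value (the text "Hello world") are made up for illustration and are not taken from the module.

POST _ingest/pipeline/es_attachment/_simulate
{
  "docs": [
    {
      "_source": {
        "es_attachment": [
          { "data": "SGVsbG8gd29ybGQ=" }
        ]
      }
    }
  ]
}

If the pipeline behaves as intended, each element of es_attachment in the simulated result should contain an "attachment" object and no longer contain "data".
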
Feature request
Status

Needs review

Component

Code

Created by

🇷🇴Romania ciprian.stavovei

