Is it possible to use search api attachments to index files that are not attachments?

Created on 15 February 2023, over 1 year ago
Updated 21 February 2023, over 1 year ago

Problem/Motivation

Drupal Core 9.5.2, Search API 8.x-1.28. I assume I am on Search API Attachments 8.x-1.0-* (the version is not shown on the Extend page).
I am using Solr 8.4.0 with the Solr Extraction method. The module works great for indexing PDFs attached to nodes and paragraphs.

I know this is search api "attachments", but I was wondering if it was possible to index files that are NOT attached to nodes or paragraphs?

I have more than 1,000 text files I want to upload to my site for search and retrieval. Attaching them to nodes would be a nightmare, and for my purposes they don't need to be attached to nodes or paragraphs. The individual text files will contain references to their source documents. I just want users to be able to keyword search the texts, view the extracts and click on the ones they want to read, the way Search API currently works in Views.

I just want to upload the files to a subdirectory under my site's file directory and point solr api to index them. Currently, I do not know:

a. If this is even possible, and;
b. How to do it if it is possible.

Can someone assist?

Thanks!

p.s. I asked OpenAI ChatGPT this question, but do not think this is the correct answer: https://sharegpt.com/c/4JAAwjR

Support request
Status

Fixed

Component

Code

Created by

United States somebodysysop

Comments & Activities

  • Issue created by @somebodysysop
  • France izus

    Hi,
    It is possible to skip nodes and search only media (Media entities).
    I'd suggest first using one of the contrib modules to bulk upload your files as media entities.
    I haven't tested them, but two candidates are: https://www.drupal.org/project/media_bulk_upload and https://www.drupal.org/project/simple_media_bulk_upload

    Here is a tutorial that could go into the README file if that is OK:

    SIMPLE USAGE EXAMPLE 3: PURE MEDIA FILE INDEX
    --------------------------------------------------------------------------
    0) This was tested with:
    drupal 9.3.0-beta3
    search_api 8.x-1.x
    search_api_attachments 8.x-1.x

    1) Install Drupal, media, search_api, search_api_db and search_api_attachments.

    2) Go to media/add and add some pdf Document files.

    3) Configure the extractor at admin/config/search/search_api_attachments, then go
    to admin/config/search/search-api/add-server and add server 'My server'
    (my_server) with the default Database Backend.

    4) Go to admin/config/search/search-api/add-index and add a new index 'My index'
    (my_index) with 'Media' as Data source and 'My server' as Server.
    Limit the indexed bundles to Document.

    5) Go to admin/config/search/search-api/index/my_index/processors and enable
    the File attachments processor.

    6) Go to admin/config/search/search-api/index/my_index/fields/add/nojs and:
    - in the General section, add the "Search api attachments: Document (saa_field_media_document)" field.

    7) Go to /admin/config/search/search-api/index/my_index/fields and set the type of
    "Search api attachments: Document" to Fulltext.

    8) Go to admin/structure/views/add and add a Page view:
    - View name: SAA
    - View settings: Show: Index My index
    - Page settings: Check "Create a page" with title and path 'saa' that
    displays the "Rendered entity" format.
    (The "Search results" format does not seem to work for now.)

    9) Add a filter to the view: the 'Fulltext search' with
    - Operator: Contains any of these words
    - Check the Expose checkbox
    - Select "Search api attachments: Document" in the "Searched fields" section.

    10) Go to admin/structure/views/view/saa and in the "Exposed Form" section (in
    the ADVANCED section), hit the 'Basic' link and choose 'Input required'
    so that the view doesn't display any default results.

    11) Go to admin/config/search/search-api/index/my_index and Index items.

    12) Go to /saa and search for any term in the pdf files :)
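
To check step 12 without clicking through the form, the search URL for the view can be built by hand. The following is only a rough sketch: the query parameter name `search_api_fulltext` is an assumption (it is the filter identifier shown in the exposed filter's settings and may differ on your site), `https://example.com` is a placeholder, and `build_search_url` is a hypothetical helper, not part of any module.

```python
from urllib.parse import urlencode, urljoin


def build_search_url(base_url: str, keywords: str,
                     path: str = "saa",
                     param: str = "search_api_fulltext") -> str:
    """Return the URL of the search view page for the given keywords.

    `path` is the view path from step 8. `param` is ASSUMED to be the
    exposed filter identifier; check the filter settings in the view.
    """
    # Make sure the base ends with a slash so urljoin keeps any subdirectory.
    base = base_url if base_url.endswith("/") else base_url + "/"
    page = urljoin(base, path.lstrip("/"))
    # urlencode escapes spaces and special characters in the keywords.
    return page + "?" + urlencode({param: keywords})


if __name__ == "__main__":
    # Placeholder site URL and search terms.
    print(build_search_url("https://example.com", "annual report"))
```

Opening the printed URL in a browser should show the same results as typing the keywords into the exposed form at /saa, provided the filter identifier matches.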

  • Status changed to Fixed over 1 year ago
  • Automatically closed - issue fixed for 2 weeks with no activity.
