- 🇺🇸 United States Chris Burge
@SomebodySysop are you seeing anything in the logs?
Just looking through the code, TextExtractorPluginBase::getRealpath() is of interest to me:

    /**
     * Helper method to get the real path from an uri.
     *
     * @param string $uri
     *   The URI of the file, e.g. public://directory/file.jpg.
     *
     * @return mixed
     *   The real path to the file if it is a local file. An URL otherwise.
     */
    public function getRealpath($uri) {
      $wrapper = $this->streamWrapperManager->getViaUri($uri);
      if($wrapper != FALSE){
        $scheme = $this->streamWrapperManager->getScheme($uri);
        $local_wrappers = $this->streamWrapperManager->getWrappers(StreamWrapperInterface::LOCAL);
        if (in_array($scheme, array_keys($local_wrappers))) {
          return $wrapper->realpath();
        }
        else {
          return $wrapper->getExternalUrl();
        }
      }
    }

I'm wondering if the method is failing to return a usable value here.
- 🇺🇸 United States somebodysysop
Thanks for the response. I have not looked at it in over a year since I don't know enough about the module to develop a patch. It was my hope that someone with much more knowledge would stumble upon this and figure out a solution.
This patch will resolve the issue, but please ensure that your site's domain has access to the S3 bucket.
- 🇳🇿 New Zealand ericgsmith
We've been using this module with s3fs for a long time with no additional patches or code changes needed, but from memory Solr needs to be configured to allow remote streaming.
- 🇳🇿 New Zealand ericgsmith
Went back to have a look at the project we were using for this.
Originally, when using Solr 8.x, we had enableRemoteStreaming set to true through some custom request dispatcher config, something like search_api_solr.solr_request_dispatcher.request_dispatcher_remote_streaming.yml:

    uuid: ....
    langcode: en
    status: true
    id: request_dispatcher_remote_streaming
    label: 'Remote Streaming'
    minimum_solr_version: 7.0.0
    environments: {  }
    recommended: true
    request_dispatcher:
      name: requestParsers
      enableRemoteStreaming: true
      multipartUploadLimitInKB: -1
      formdataUploadLimitInKB: -1
      addHttpRequestToContext: true

In later Solr versions this changed to being enabled by an environment variable, so now we just have:

    SOLR_OPTS: "-Dsolr.enableRemoteStreaming=true"

But before I fill you with false hope: we were using the S3FS module with the public file takeover, meaning the bucket is publicly accessible and the external URL is used. I haven't tested with the non-public wrapper.
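
For anyone wondering where a SOLR_OPTS line like the one above might live: a minimal sketch, assuming Solr runs from the official Docker image under Docker Compose. The service name and image tag are hypothetical and not taken from this thread; only the SOLR_OPTS value comes from the comment above.

```yaml
# Hypothetical docker-compose.yml fragment. The service name "solr" and the
# image tag are assumptions for illustration only.
services:
  solr:
    image: solr:9
    environment:
      # Sets the system property that enables remote streaming,
      # as quoted in the comment above.
      SOLR_OPTS: "-Dsolr.enableRemoteStreaming=true"
```

On a non-Docker install the same value would presumably go wherever SOLR_OPTS is defined for the Solr start script. Either way, the file URL handed to Solr still has to be reachable from the Solr host, which is why the public bucket matters in the setup described above.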