I have Solr and Tika up and running, and I can index PDFs, which is great.
However, on the search result page, I only get a link to the parent entity, not a link to the specific page in the PDF that contains the search term. I'm not sure if this is a limitation of this module, Solr, or Tika, or if I am just not using it correctly.
When I test Tika at the command line, it returns an HTML document based on the PDF. Perhaps the location of the text within the PDF is not available to Tika?
Different browsers treat PDFs differently, so this limitation may exist because there is no standard way to link from a website to a specific page and phrase in a PDF.
I have come up with a workaround: embedding PDF.js on the taxonomy term page (files are attached to taxonomy terms in this case) and preloading the PDF.js search with the search phrase from the search page. It works pretty well, but the client wants every instance of the term in the PDF listed on the search result page, each with a link to that specific page in the PDF. I am just not sure if it can work that way.
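For the per-page links, a rough sketch of what I have in mind, in Python. It assumes the stock PDF.js viewer, which as far as I know takes the document in a `file` query parameter and reads `page`, `search` and `phrase` from the URL fragment. The viewer path and file path in the usage note are hypothetical, and `pdfjs_link` is just a name I made up.

```python
from urllib.parse import quote, urlencode


def pdfjs_link(viewer_url, pdf_url, page, term):
    """Build a link that opens the PDF.js viewer on a given page with a search preloaded.

    Assumption: the stock PDF.js viewer, which reads ?file=... and the
    #page=...&search=...&phrase=... fragment parameters.
    """
    query = urlencode({"file": pdf_url})
    # quote_via=quote encodes spaces as %20 rather than "+" in the fragment.
    fragment = urlencode(
        {"page": page, "search": term, "phrase": "true"},
        quote_via=quote,
    )
    return f"{viewer_url}?{query}#{fragment}"
```

Calling `pdfjs_link("/pdfjs/web/viewer.html", "/files/report.pdf", 3, "annual report")` would give one link per matching page, provided the page numbers can be stored in the index alongside the extracted text.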
Thanks for your help!
Closed: won't fix
1.6
Miscellaneous