Tagging file fields does not work when there is a filter on concept scheme

Created on 3 November 2021, about 3 years ago
Updated 21 August 2023, about 1 year ago

Problem/Motivation

When PowerTagging is configured to tag content based on a PoolParty project, the extractor does not work on file fields. I think this happens only when the taxonomy contains several concept schemes and the synchronization is configured for a single concept scheme.

Steps to reproduce

1. Install and configure semantic_connector and PowerTagging.
2. Link a Drupal taxonomy to a single concept scheme from a PoolParty project with multiple concept schemes.
3. Link the taxonomy to a content type field.
4. Add a file field to the content type and configure the PowerTagging field to extract concepts from it (along with other fields such as Title and Body).
5. Create and save a node with text and a file (PDF) attached.
6. Open the edit form and use the "Get Tags" button.

Expected behavior: Return a list of matching tags
Actual behavior: A message saying that no terms were matched

I created a proof of concept that reproduces the error using only a configured PoolParty server (replace projectId and provide a test.pdf).

If the conceptSchemeFilters line below is commented out, the extraction works; with it enabled, no suggestions are returned:

// Request settings: the Accept header and the multipart form fields.
$variables = [];
$variables['headers']['Accept'] = 'application/json';
$variables['data'] = [
	'file' => new \CURLFile('test.pdf'),
	'language' => 'en',
	// Replace with the ID of your own PoolParty project.
	'projectId' => 'd9627451-064c-4d1b-bce7-e006e31d8235',
	'numberOfConcepts' => 20,
	'numberOfTerms' => 0,
	'corpusScoring' => 'corpus:d24ba856-c382-495c-bede-5a76ae6adb71',
	// Comment out the next line and the extraction returns suggestions.
	'conceptSchemeFilters' => ['https://vocabulary.test.org/Mastertest/3517'],
];
$ch = curl_init();

$url = 'https://poolparty.org/extractor/7.1/api/extract';
curl_setopt($ch, CURLOPT_URL, $url);
// Send the Accept header defined above.
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Accept: ' . $variables['headers']['Accept']]);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_POST, TRUE);
// Passing an array makes cURL send the body as multipart/form-data.
curl_setopt($ch, CURLOPT_POSTFIELDS, $variables['data']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);

echo "------ REQUEST ---------\n";
var_dump($variables);

$response_raw = curl_exec($ch);
$http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
echo "HTTP status code: {$http_code}\n";
echo "------ RESPONSE ---------\n";
// Print the cURL error on a transport failure or a non-200 response.
if ($response_raw === FALSE || $http_code != 200) {
	$error = curl_error($ch);
	var_dump($error);
}
var_dump($response_raw);
curl_close($ch);
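
One thing that may be worth ruling out in the proof of concept above: as far as I know, PHP's cURL extension does not serialize nested arrays passed to CURLOPT_POSTFIELDS. It casts the inner array to the string "Array" and raises an "Array to string conversion" notice or warning. The sketch below flattens nested values into bracketed field names before they are handed to cURL. The helper name flatten_post_fields is hypothetical (it is not part of PowerTagging, semantic_connector or PoolParty), and whether the extractor accepts names such as conceptSchemeFilters[0] is an assumption that I have not verified.

```php
<?php
// Hypothetical helper, for illustration only: flattens nested arrays into
// bracketed keys so that every value given to CURLOPT_POSTFIELDS is either
// a scalar or an object such as CURLFile.
function flatten_post_fields(array $fields, string $prefix = ''): array {
	$flat = [];
	foreach ($fields as $key => $value) {
		// Top-level keys keep their name; nested keys become "parent[key]".
		$name = $prefix === '' ? (string) $key : $prefix . '[' . $key . ']';
		if (is_array($value)) {
			// Recurse into nested arrays and merge the flattened entries.
			$flat += flatten_post_fields($value, $name);
		}
		else {
			// Scalars and objects (e.g. CURLFile) are passed through untouched.
			$flat[$name] = $value;
		}
	}
	return $flat;
}

// Example with the filter value from the proof of concept.
$fields = flatten_post_fields([
	'language' => 'en',
	'conceptSchemeFilters' => ['https://vocabulary.test.org/Mastertest/3517'],
]);
// $fields now has the key 'conceptSchemeFilters[0]' instead of a nested array.
```

To try it, the proof of concept could pass flatten_post_fields($variables['data']) to CURLOPT_POSTFIELDS instead of the raw array and compare the two responses.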

🐛 Bug report

Status: Fixed
Version: 1.0
Component: Code
Created by: cristiroma (🇷🇴 Romania)
