Problem/Motivation
When PowerTagging is configured to tag content based on a PoolParty project, the extractor does not work on file fields. I think this happens only when the taxonomy contains several concept schemes and the synchronization is configured for a single concept scheme.
Steps to reproduce
1. Install and configure semantic_connector and PowerTagging
2. Link a Drupal taxonomy to a single concept scheme from a PoolParty project that has multiple concept schemes
3. Link the taxonomy to a content type field
4. Add a file field to the content type and configure the PowerTagging field to extract concepts from it (and from the other fields, Title and Body)
5. Create and save a node with text and a PDF file attached.
6. Open the node edit form and use the "Get Tags" button.
Expected behavior: Return a list of matching tags
Actual behavior: A message says that no terms were matched
I created a proof of concept that reproduces the error using only a configured PoolParty instance (replace projectId and provide a test.pdf).
If the conceptSchemeFilters line below is commented out, the extraction works; with it enabled, no suggestions are returned:
$variables = [];
$variables['headers']['Accept'] = 'application/json';
$variables['data'] = [
  'file' => new \CURLFile('test.pdf'),
  'language' => 'en',
  'projectId' => 'd9627451-064c-4d1b-bce7-e006e31d8235',
  'numberOfConcepts' => 20,
  'numberOfTerms' => 0,
  'corpusScoring' => 'corpus:d24ba856-c382-495c-bede-5a76ae6adb71',
  'conceptSchemeFilters' => ['https://vocabulary.test.org/Mastertest/3517']
];
$ch = curl_init();
$url = 'https://poolparty.org/extractor/7.1/api/extract';
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $variables['data']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
echo "------ REQUEST ---------\n";
var_dump($variables);
$response_raw = curl_exec($ch);
$http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
echo "HTTP status code: {$http_code}\n";
echo "------ RESPONSE ---------\n";
if ($http_code != 200) {
  $error = curl_error($ch);
  var_dump($error);
}
var_dump($response_raw);
curl_close($ch);
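
One thing that may be worth ruling out in the proof of concept itself: as far as I know, when CURLOPT_POSTFIELDS is given a PHP array, each value is expected to be a string or a CURLFile, and a nested array such as conceptSchemeFilters is cast to the literal string "Array" (with an "Array to string conversion" warning). If that applies here, the extractor would receive a filter that matches no concept scheme. The sketch below only flags such fields before the request is sent; the helper name poc_nested_array_fields is hypothetical and not part of semantic_connector or PowerTagging, and it makes no claim about which parameter format the extractor accepts.

```php
<?php
// Hypothetical debugging helper, not part of semantic_connector or PowerTagging.
// Returns the keys of request fields whose values are nested arrays.
function poc_nested_array_fields(array $fields): array {
  $nested = [];
  foreach ($fields as $key => $value) {
    // Strings, numbers and CURLFile objects pass; only plain arrays are flagged.
    if (is_array($value)) {
      $nested[] = $key;
    }
  }
  return $nested;
}

// Usage with a payload shaped like the one above (file field left out).
$fields = [
  'language' => 'en',
  'numberOfConcepts' => 20,
  'conceptSchemeFilters' => ['https://vocabulary.test.org/Mastertest/3517'],
];
print_r(poc_nested_array_fields($fields));
```

If conceptSchemeFilters is reported, comparing the request with the filter passed as a plain string instead of an array would show whether the empty result comes from the extractor or from how the field is encoded.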