- Issue created by @kevinquillen
-
kevinquillen
committed 4704a73f on 1.0.x
Issue #3343862 by kevinquillen: Tweak the summarize/taxonomy suggester...
- 🇺🇸 United States kevinquillen
Committed some changes to dev. The results coming back are already 10x better than they were previously. Here is an example from my own content:
It has been far more accurate with every article I have tried.
It would be good to expand on this a bit here later and allow the user to select which longtext field to summarize, but for now this is working. I may be able to get to that part (selecting which field) next week.
- 🇺🇸 United States kevinquillen
It might be a good idea to also run the text through DOMDocument and delete any nodes that are pre or code formatted. I have noticed that code samples, even after being passed through strip_tags, aren't cleanly removed and can interfere with summaries.
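The idea above can be sketched in a few lines. This is an illustration only, written in Python with the standard-library `html.parser` rather than the module's PHP and DOMDocument; the names `CodeStrippingExtractor` and `strip_code_and_tags` are hypothetical. It shows why tag stripping alone is not enough: stripping tags keeps the text between them, so the contents of pre and code elements have to be dropped before the remaining text is collected.

```python
from html.parser import HTMLParser


class CodeStrippingExtractor(HTMLParser):
    """Collects visible text, skipping anything inside <pre> or <code>."""

    SKIPPED_TAGS = {"pre", "code"}
    # Tags treated as word boundaries so adjacent paragraphs do not fuse.
    BLOCK_TAGS = {"p", "div", "li", "br", "pre", "blockquote"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        if tag in self.BLOCK_TAGS:
            self._chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        if tag in self.BLOCK_TAGS:
            self._chunks.append(" ")

    def handle_data(self, data):
        # Keep text only when we are not inside <pre> or <code>.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace left behind by removed elements.
        return " ".join("".join(self._chunks).split())


def strip_code_and_tags(html):
    parser = CodeStrippingExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<p>Intro text.</p><pre><code>$x = 1;</code></pre>"
    "<p>More <code>foo()</code> prose.</p>"
)
print(strip_code_and_tags(sample))
```

One caveat with this approach: an unclosed pre or code tag would cause everything after it to be skipped, so real input may need a sanity check first.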
- Status changed to Needs work
5:44pm 23 February 2023
- 🇺🇸 United States kevinquillen
So far so good. Still getting good results.
I did notice that we may have an issue trying to remove special tokens like or ... may have to figure out how to handle that too.
- @kevinquillen opened merge request.
- 🇺🇸 United States d0t101101
@kevinquillen - I can also confirm that these tweaks to the OpenAI queries made a dramatic improvement for summarization and taxonomy generation, which is part of the openai_content submodule (which now appears on the node edit pages). Tested this across 10 different nodes with varying subjects and lengths; working great.
Well done, sir!
- 🇺🇸 United States kevinquillen
Ok, this is probably in a good enough position at the moment. I can go back and help other areas with the StringHelper (name subject to change) utility class, and implement the stopwords method that is currently in the queue worker and bring it all into one helper class.
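As a rough picture of what a stopwords method on a shared helper class could look like, here is a minimal sketch. It is in Python rather than the module's PHP, the class only borrows the StringHelper name from the comment above, and the `remove_stopwords` method and its tiny word list are assumptions made for illustration, not the module's actual code or list.

```python
import string


class StringHelper:
    """Hypothetical helper collecting text clean-up utilities in one place."""

    # Assumed, deliberately small list; a real one would be much longer.
    STOPWORDS = frozenset({
        "a", "an", "and", "are", "as", "at", "for", "in",
        "is", "it", "of", "on", "the", "to", "with",
    })

    @classmethod
    def remove_stopwords(cls, text):
        # Compare each word without surrounding punctuation and case,
        # but keep the original form of the words that survive.
        kept = [
            word for word in text.split()
            if word.strip(string.punctuation).lower() not in cls.STOPWORDS
        ]
        return " ".join(kept)


print(StringHelper.remove_stopwords("The results of the query are in."))
```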
-
kevinquillen
committed c4f09af4 on 1.0.x
Issue #3343862 by kevinquillen: Tweak the summarize/taxonomy suggester...
- Status changed to Fixed
3:49pm 28 February 2023
- 🇺🇸 United States d0t101101
The latest 'Suggest Taxonomy' feature is great and quite powerful! Excellent English grammar skills over there with now requesting 'nouns and adjectives' only too :)
I've noticed that if you run 'Suggest Taxonomy' repeatedly, the result is sometimes a numbered list and other times a comma-separated list. Ideally it should always be a comma-separated list so that it can be quickly copied and pasted into a Drupal Autocomplete Tags type of input. This seems to do the trick!
'Suggest five words to classify the following text. The words must be nouns or adjectives, comma separated:'
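Even with the prompt above, a defensive clean-up step on the response can guard against the occasional numbered list. The following is a minimal sketch in Python (not the module's PHP); `build_prompt` and `normalize_suggestions` are hypothetical names, and no API call is made here. It only shows how the prompt text could be attached to the content and how either response shape could be reduced to one comma-separated line.

```python
import re

# Prompt text quoted from the comment above.
PROMPT = (
    "Suggest five words to classify the following text. "
    "The words must be nouns or adjectives, comma separated:"
)

# Matches a leading list marker such as "1.", "2)", "-" or "*".
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*])\s*")


def build_prompt(text):
    """Prepend the instruction to the text that should be classified."""
    return f"{PROMPT}\n\n{text}"


def normalize_suggestions(raw):
    """Turn a numbered, bulleted, or comma-separated reply into 'a, b, c'."""
    terms = []
    for line in raw.splitlines():
        line = _LIST_MARKER.sub("", line)
        for part in line.split(","):
            term = part.strip().strip(".")
            if term:
                terms.append(term)
    return ", ".join(terms)


print(normalize_suggestions("1. Drupal\n2. Modules\n3. AI"))
print(normalize_suggestions("Drupal, Modules, AI."))
```

Both calls produce the same single-line result, which is the form that pastes cleanly into a tags autocomplete field.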
Automatically closed - issue fixed for 2 weeks with no activity.