- Issue created by @marcus_johansson
- Assigned to marcus_johansson
Currently we do not cut off long-running agent reasoning at all; if it gets too long, it will run out of tokens and die. This means that for real research issues our agents are working quite poorly at the moment.
For assistants we currently just cut off after X sessions, if you have that enabled (or use no memory at all).
There are a lot of different ways to handle this:
1. No memory - the agent will not even remember the last thing you said.
2. Keep only the X newest messages - the problem is that the agent might reuse a tool it has already used or repeat answers it has already given.
3. Tokenize and keep only X tokens - can be process-heavy.
4. Keep the latest X messages and summarize everything before that into a bounded number of tokens (see the sketch after this list).
5. Embeddings - store the memory in temporary embeddings and retrieve information based on that.
6. Structured memory template - like a summary, but structured into clear paths such as goals, observations, actions, reasoning etc.
7. And probably 50 other approaches.
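To illustrate option 4, here is a minimal Python sketch. Everything in it is an assumption for illustration: the message format is plain strings, and summarize() is a stand-in for whatever LLM summarization call the module would actually make.

```python
from dataclasses import dataclass, field


def summarize(text: str, max_chars: int = 800) -> str:
    """Hypothetical helper: in practice this would call an LLM with a
    summarization prompt and a token budget. Here it just truncates."""
    return text[-max_chars:]


@dataclass
class WindowedSummaryMemory:
    """Option 4: keep the latest N messages verbatim and roll
    everything older into a single running summary."""
    window_size: int = 10
    summary: str = ""
    messages: list = field(default_factory=list)

    def add(self, message: str) -> None:
        self.messages.append(message)
        # Once the window overflows, fold the oldest message into the
        # running summary instead of dropping it on the floor.
        while len(self.messages) > self.window_size:
            oldest = self.messages.pop(0)
            self.summary = summarize(self.summary + "\n" + oldest)

    def context(self) -> str:
        # What gets prepended to the prompt on the next turn:
        # the rolled-up summary, then the verbatim recent window.
        prefix = self.summary + "\n" if self.summary else ""
        return prefix + "\n".join(self.messages)
```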
We need to turn this into something extensible, so that any chatbot, agent or other component that works with threads can plug into it.
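A rough sketch of what such an extension point could look like, again in Python for illustration only; the interface name and method signatures here are hypothetical, not the module's actual API:

```python
from abc import ABC, abstractmethod


class MemoryPluginInterface(ABC):
    """Hypothetical contract that any memory strategy would implement,
    so chatbots, agents and other thread consumers can swap strategies
    without caring how the context is pruned."""

    @abstractmethod
    def add_message(self, thread_id: str, message: str) -> None:
        """Record a new message on the given thread."""

    @abstractmethod
    def get_context(self, thread_id: str, token_budget: int) -> str:
        """Return the context to inject into the next prompt,
        trimmed to fit the given token budget."""


class NoMemory(MemoryPluginInterface):
    """Option 1 from the list above: forget everything."""

    def add_message(self, thread_id: str, message: str) -> None:
        pass

    def get_context(self, thread_id: str, token_budget: int) -> str:
        return ""
```

With a contract like that, options 2 through 6 above become interchangeable implementations, and ShortTermMemoryPlugin would just be one of them.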
Note that ShortTermMemoryPlugin could be used for storing threads over a longer time. The defining question for the memory type is whether it is a memory that has been built up over a long period or not: https://superagi.com/introduction-to-agent-summary-improving-agent-outpu...
We will also introduce LongTermMemoryPlugin, but this is a little more complex, both because it is a learning algorithm and from a privacy point of view, since you are profiling users: if you have bought blue dresses before and ask for a t-shirt, it will have a profile of you preferring blue things and give back blue t-shirts.
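To make the profiling concern concrete, here is a minimal sketch of the kind of preference profile a long-term memory would accumulate. All names here are hypothetical and the attribute counting is deliberately simplistic:

```python
from collections import Counter


class PreferenceProfile:
    """Hypothetical long-term memory: counts attributes seen in past
    interactions and biases future answers toward them. This is exactly
    the part that raises privacy questions, since it profiles the user."""

    def __init__(self) -> None:
        self.attributes: Counter = Counter()

    def observe(self, attributes: list[str]) -> None:
        # e.g. observe(["dress", "blue"]) after a purchase
        self.attributes.update(attributes)

    def bias_for(self, query: str) -> list[str]:
        # Return the strongest learned preferences to attach to the
        # query, e.g. "t-shirt" ends up biased toward "blue".
        return [attr for attr, _ in self.attributes.most_common(3)]


profile = PreferenceProfile()
profile.observe(["dress", "blue"])
profile.observe(["skirt", "blue"])
print(profile.bias_for("t-shirt"))  # ['blue', 'dress', 'skirt']
```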
If the AI Assistants API for instance wants to use this, it will
- Status: Active
- Version: 1.2
- Component: AI Core module