- Issue created by @marcus_johansson
- 🇬🇧United Kingdom MrDaleSmith
How confident are we that, if we ask the AI to give us an explanation for why it has done what it has done, it will actually give us one?
I'm nowhere near an expert on this, but my understanding is that when you query an LLM it generates a response by repeatedly choosing the next word based on how probable that word is, given similar questions in its context and training data. If you ask the AI for an explanation, wouldn't it need a degree of self-awareness to have actually decided on a course of action and then be able to accurately explain its reasoning for choosing that response? Aren't you more likely to get back the most probable-sounding response to the question "Why did you do that?"
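To make the point concrete, here's a rough toy sketch of the kind of next-word loop I understand an LLM to run (the "model", vocabulary, and probabilities here are entirely made up for illustration): the reply to "Why did you do that?" falls out of exactly the same loop as the original answer did, rather than from any inspection of actual reasoning.

```python
import random

# Toy illustration only: a made-up "model" mapping a context to next-word
# probabilities. A real LLM does this with billions of parameters, but the
# generation loop has the same shape.
FAKE_NEXT_WORD_PROBS = {
    "Why did you do that?": [("Because", 0.6), ("I", 0.3), ("The", 0.1)],
    "Because": [("it", 0.7), ("the", 0.3)],
    # ...and so on for every context the model has seen patterns for.
}

def next_word(context: str) -> str:
    """Sample the next word from the (made-up) probability distribution."""
    words, weights = zip(*FAKE_NEXT_WORD_PROBS.get(context, [("<end>", 1.0)]))
    return random.choices(words, weights=weights)[0]

def generate(prompt: str, max_words: int = 20) -> str:
    """Generate a reply one word at a time. Whether the prompt is a task or a
    request to explain an earlier reply, the procedure is identical."""
    context, output = prompt, []
    for _ in range(max_words):
        word = next_word(context)
        if word == "<end>":
            break
        output.append(word)
        context = word  # toy shortcut: real models condition on the full history
    return " ".join(output)

print(generate("Why did you do that?"))
```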