Hi @nick_vh, @yashn_99!
I recently came across the project description for developing a chat-bot style interface for Drupal, and I am extremely interested in participating in this project. Given my background in natural language processing (NLP) and my prior experience working on similar projects, I believe my skills could be a significant asset to the team.
Allow me to introduce myself: my name is Yuntao, and I am currently pursuing a Master's degree in Computer Engineering in the United States. I have been involved in NLP projects in various capacities. For example, I worked as a research assistant at Dartmouth College where I led the development of a novel HCI approach to improve typing efficiency and user experience. This project involved model training, dataset construction, and experiment design. I also independently researched and built a medical dialogue system which aimed to act as a doctor and interact with patients through natural language. I compared and analyzed different retrieval-based and generation-based approaches to achieve optimal performance.
Considering my experience in NLP and dialogue systems, I am eager to contribute to the Drupal chatbot project. I am excited by the opportunity to create a new interaction model for learning Drupal and to evaluate the potential of AI systems in open source initiatives.
Specifically, for this project, my initial idea was to fine-tune the OpenAI GPT-3 model. The first step would be to pre-process the Drupal FAQ, documentation, etc. into the format required for training, and then fine-tune the model through the OpenAI API so that it can answer Drupal questions.
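A minimal sketch of that pre-processing step, assuming the FAQ has already been collected as question/answer pairs. It writes one JSON object per line with `prompt` and `completion` fields, which is the JSONL layout that GPT-3 fine-tuning expected as far as I know. The sample entries, the separator, and the stop string are illustrative assumptions, not part of the project.

```python
import json

# Hypothetical FAQ entries; real data would come from the Drupal FAQ and docs.
faq = [
    {"question": "How do I clear the cache?",
     "answer": "Rebuild it with `drush cr`."},
    {"question": "Where do contributed modules go?",
     "answer": "Under the modules/contrib directory."},
]


def to_jsonl(pairs, separator="\n\n###\n\n", stop=" END"):
    """Turn Q&A pairs into JSONL text, one training record per line.

    `separator` marks the end of the prompt and `stop` marks the end of the
    completion; both values are assumptions chosen for this sketch.
    """
    lines = []
    for pair in pairs:
        record = {
            "prompt": pair["question"].strip() + separator,
            # A leading space before the answer is a common convention.
            "completion": " " + pair["answer"].strip() + stop,
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_jsonl(faq))
```

The resulting text can be saved to a `.jsonl` file and checked by hand before any training run.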
However, since all of the work should be FOSS, I have an alternative for the model fine-tuning in the first step:
a solution that combines both generation and retrieval (what I call the Integration Method).
Use an open-source model such as Seq2Seq, HRED, GPT-2, or DialoGPT as the base model for fine-tuning. (I call these "mini" models because they have far fewer parameters than GPT-3, which has 175 billion.) Then encode the relevant documents and Q&A pairs prepared in the pre-processing step into vectors with SBERT and store them in an Elasticsearch database for easy retrieval. When the user asks the bot a question, the answer generated by the AI model and the answer retrieved from Elasticsearch are compared, and the better one is selected as the final output and returned to the user.
(If you are interested in the Integration Method, the details can be found here: 9w.pw/IntegrationMethod )
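A rough sketch of the selection step in the Integration Method. To stay self-contained it makes several assumptions: a toy bag-of-words vector stands in for the SBERT embedding, an in-memory list stands in for the Elasticsearch index, `generate` is a placeholder for the fine-tuned model, and "better" is decided by a similarity threshold on the retrieved match. The real comparison rule may well differ.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for an SBERT sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical indexed Q&A pairs; the proposal stores these in Elasticsearch.
INDEX = [
    ("how do i install a module",
     "Require the module with Composer, then enable it."),
    ("how do i clear the cache",
     "Rebuild the cache with drush cr."),
]


def retrieve(question):
    """Return the stored answer closest to the question, plus its score."""
    q = embed(question)
    scored = [(cosine(q, embed(stored_q)), ans) for stored_q, ans in INDEX]
    score, ans = max(scored)
    return ans, score


def generate(question):
    """Placeholder for the fine-tuned generative model (assumption)."""
    return "I am not sure, but the Drupal documentation may help."


def answer(question, threshold=0.6):
    """Prefer the retrieved answer when it matches well, else generate."""
    retrieved, score = retrieve(question)
    return retrieved if score >= threshold else generate(question)
```

For example, `answer("How do I clear the cache?")` returns the stored cache answer, while an unrelated question falls back to the generator. The threshold of 0.6 is arbitrary and would need tuning on real data.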
The advantage of the Integration Method is that it does not rely on third-party services such as OpenAI: the technology stack is completely open source, so the final project results can be completely open source as well.
The disadvantage, of course, is that the results may not be as good as with OpenAI's GPT-3 model (after all, OpenAI has invested enormous resources in training it).
In the second step, I would use Vue.js to build the front-end user interface for interacting with the AI model, and PHP for the back-end (chosen because Drupal uses PHP, though I am comfortable with other languages). The back-end processes the user input sent from the front-end, calls the fine-tuned model from the previous step through OpenAI's API, and returns the AI response to the user.
Subsequent model iterations: After the project goes online, users will interact with the model, and we can use this interaction data to further fine-tune the original model with reinforcement learning and other methods to achieve better results. My preliminary plan for this part is as follows: after each interaction with the bot, a rating box pops up in the front-end so the user can score the interaction. We can then select the low-scoring bot responses, manually write new responses for them, and retrain the AI model on the corrected data.
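A small sketch of that feedback loop, assuming ratings run from 1 (bad) to 5 (good) and that anything rated 2 or lower is flagged. The class and function names are hypothetical. Note that this shows plain supervised retraining on corrected answers, not reinforcement learning.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    question: str
    bot_response: str
    score: int  # user rating; assumed scale of 1 (bad) to 5 (good)


def select_for_review(log, max_score=2):
    """Return the interactions whose rating is at or below `max_score`."""
    return [item for item in log if item.score <= max_score]


def build_retraining_pairs(flagged, corrections):
    """Pair each flagged question with its manually written replacement.

    `corrections` maps a question to the new answer written by a reviewer;
    flagged questions without a correction yet are skipped.
    """
    return [
        {"prompt": item.question, "completion": corrections[item.question]}
        for item in flagged
        if item.question in corrections
    ]


if __name__ == "__main__":
    log = [
        Interaction("How do I add a block?", "Use the Block layout page.", 5),
        Interaction("How do I update core?", "Reinstall Drupal.", 1),
    ]
    flagged = select_for_review(log)
    fixes = {"How do I update core?": "Update drupal/core with Composer."}
    print(build_retraining_pairs(flagged, fixes))
```

The output pairs use the same prompt/completion shape as the training data, so they can be appended to the fine-tuning set for the next iteration.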
This is just my initial idea; any comments or suggestions are welcome!