Proposal 2023: Chatbot interface for Drupal documentation and contributor onboarding

Created on 3 February 2023, over 1 year ago
Updated 12 February 2024, 5 months ago

Project Mentor

yashn_99

Project Difficulty

INTERMEDIATE

Project Skills/Prerequisite

Basic knowledge of how Drupal.org's Drupal documentation is structured, between the user guide and the general wiki docs

Project Description

Develop a chat-bot style interface for asking common questions about Drupal, including how to learn Drupal basics, and also how to contribute to the Drupal community.

Importantly, we want to focus on the existing creative-commons licensed Drupal documentation, and if possible incorporate attribution by using the revision history to identify contributors to the docs, or at minimum ensure sources are cited. This is to address concerns about AI plagiarism in a responsible way, that still demonstrates the potential value of AI resources.

Expected Size of project

175 hours

Project Goal

  • To create a new interaction model for learning Drupal, and for learning how to participate in the Drupal community.
  • To evaluate the potential of AI systems as training and onboarding resources in Open Source.
  • And to demonstrate how AI resources can be used in an open source/creative commons context to address concerns about plagiarism in the training of AI models.

Project Resources

🌱 Plan
Status

Fixed

Component

Organization

Created by

🇺🇸United States hestenet Portland, OR 🇺🇸

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • Issue created by @hestenet
  • 🇺🇸United States hestenet Portland, OR 🇺🇸
  • 🇮🇳India MilderBronze

    Is anyone working on this project? I want to work on this issue!

  • 🇧🇪Belgium Nick_vh Ghent

    Hi MilderBronze - please leave your thoughts here how you would approach this problem. Maybe an integration with ChatGPT, maybe something else? Note: This is very experimental, so excellent self-organization skills are required. There is not a lot of mentoring possible other than directional mentoring.

  • Hai, I was interested in this project, my approach would be as follows.
    First, choose the API(chatGPT for the time being)(is Cohere API an option?), then identify the common questions and the answers, for this to occur smoothly, a thorough understanding of the documentation is necessary. Then we have to train the model using the API by providing the question and answers. Then an interface has to be developed, for this are there any restrictions on libraries or frameworks we can use? Then the chatbot has to be tested and deployed! This is an overall approach.
    Won't this project take time, isn't it a 350hr one???

  • I am David Jennicson, a third-year engineering electronics and computer science student at Fr. Conceicao Rodrigues College of Engineering in Mumbai, India. I am passionate about artificial intelligence and wish to develop a chatbot using Drupal. I am currently working on hateful meme detection project with the usage of knowledge graph and due to this project I developed my interest in AI.
    My approach to this project is to compare the performance of existing open-source language models, such as GPT-2, DialogGPT, T5, BERT, and fine-tune them with data gathered from Drupal. I also require guidance on understanding the Drupal ecosystem and clarifying the mentors' expectations regarding the chatbot's performance. Do you require a dynamic conversational chatbot, or a simple one like google Dialog flow? Additionally, I intend to create a dashboard dedicated to the chatbot, where organizations can access data on conversation intent, audience (developers and clients), and sentiment analysis using React.

  • 🇮🇳India royalpinto007

    Hello @nick_vh, @yashn_99!

    My name is Royal, and I am currently a sophomore student at the National Institute of Technology Karnataka, Surathkal. I am thrilled to see the initiative to develop a chat-bot style interface for asking common questions about Drupal. I believe this will greatly benefit those who are new to Drupal and are looking for a quick and easy way to learn Drupal basics. Additionally, I appreciate the effort to incorporate attribution by using the revision history to identify contributors to the docs, which is a responsible approach to address AI plagiarism concerns.

    As a beginner, I too encountered certain difficulties when setting up Drupal on my local system, despite the availability of documentation and drupal-slack-support. However, I was able to overcome these obstacles. In such a scenario, a chatbot could provide prompt assistance.

    Apart from my situation, some examples of how a chatbot can be useful specifically for Drupal:
    - Troubleshooting: Can be programmed to provide assistance with Drupal troubleshooting, helping users resolve issues they encounter while setting up or managing a Drupal website.
    - Documentation: Provide quick and easy access to Drupal documentation, making it easier for users to find the information they need to resolve issues or complete tasks.
    - Training: Assistance with Drupal training, helping users learn how to use Drupal effectively and efficiently.
    - Support: Chatbots can provide support for Drupal-related issues, helping users resolve technical issues or errors they encounter while using Drupal.
    - Best Practices: Chatbots can provide guidance on Drupal best practices, helping users optimize their Drupal websites for performance, security, and user experience. (like I faced, when I was searching about the development checks that are present or established prior on Drupal, in order to fix coding standards or resolve phpcs error, etc)

    Some ideas to achieve the project goals could be:
    - Create an API that provides access to Drupal documentation and community resources. This API could be used to develop new chat-bot interfaces and other AI-powered resources that make it easier for users to learn Drupal and participate in the Drupal community.
    - Use Drupal APIs to create a platform that allows users to track their contributions to the Drupal community.
    - Develop an AI-powered chat-bot interface that can answer common questions about Drupal using Drupal APIs to retrieve and present relevant information from the Drupal community.

    As a novice in this field, I am fully aware that this project may require some guidance and mentorship, but I am eager to put in the effort and time required to learn and contribute to the project. I am highly motivated to gain new skills and knowledge that would enable me to make a meaningful contribution to the project's success.

    In case you need more information from me, I will definitely provide it. I am excited about the opportunity to work on this project and contribute to the Drupal community.

    I would like to propose that the project be categorized under the 350-hour limit, given its experimental nature and the potentially substantial amount of time required to work on it. By keeping the project within this scope, we can ensure that sufficient time is allocated to research and develop it thoroughly while maintaining a reasonable and achievable timeline.

  • Hii @nick_vh @yashn_99

    I have thought that we can make a book that contains All documentation of drupal with a feature that we can know api from which this contains taken(can be achieved by mapping of book pages with reference link). so that while giving a small answer by chatbot we can also share that documentation link reference so that for more information users can actually visit that api and take more info!.

    According to me Building our own deep learning model api will give us more control on data which is served to users and in future we can easily modify it.

    Please give you thought on this.

  • 🇮🇳India royalpinto007

    I've created a user interface for the project. Drawing inspiration from ChatGPT's graphics and icons, I've included similar elements to help users identify different conversation options. My UI design is distinct and tailored to the project's needs, aiming to make the conversation engaging and easy to navigate.

    I'm eager to receive feedback to refine and improve the design.

  • Royalpinto0007, I must say that I was truly impressed with the platform you have created. However, I noticed that there is one minor issue with the current setup - users have to open a new page whenever they need assistance. What if we were to embed the bot with a floating window instead?

    This approach would not only be easier to navigate, but also provide a more modern look and feel to the platform. In fact, I have taken the liberty of creating a UI design that showcases this approach (see attached).

    I wanted to present a fresh perspective and hear your thoughts on this idea. Please feel free to correct me if I am mistaken in any way.

  • 🇮🇳India royalpinto007

    @PratikSolanki
    I thought the same about building a floating window at first, but since this project involves providing extensive documentation and onboarding resources, I believe a full separate page for the chatbot interface would be more suitable.

    A full separate page would allow us to create a more immersive and engaging experience for users, with plenty of space to display information and integrate the creative-commons licensed Drupal documentation.

    Furthermore, a full separate page would enable us to use Drupal's built-in functionality to create custom templates and styles for the chatbot interface. This would allow us to design a unique and visually appealing interface that complements the rest of the website.

    In terms of attribution, using the revision history to identify contributors to the docs is a responsible and effective approach that can help ensure proper citation of sources.

  • Hi @nick_vh, @yashn_99!

    I recently came across the project description for developing a chat-bot style interface for Drupal, and I am extremely interested in participating in this project. Given my background in natural language processing (NLP) and my prior experience working on similar projects, I believe my skills could be a significant asset to the team.

    Allow me to introduce myself: my name is Yuntao, and I am currently pursuing a Master's degree in Computer Engineering in the United States. I have been involved in NLP projects in various capacities. For example, I worked as a research assistant at Dartmouth College where I led the development of a novel HCI approach to improve typing efficiency and user experience. This project involved model training, dataset construction, and experiment design. I also independently researched and built a medical dialogue system which aimed to act as a doctor and interact with patients through natural language. I compared and analyzed different retrieval-based and generation-based approaches to achieve optimal performance.

    Considering my experience in NLP and dialogue systems, I am eager to contribute to the Drupal chatbot project. I am excited by the opportunity to create a new interaction model for learning Drupal, evaluate the potential of AI systems in open source initiatives.

    Specifically, for this project, my initial idea was to fine-tune the OpenAI GPT-3 model by pre-processing the Drupal FAQ, documents, etc. to conform to the format needed for training the model, and then fine-tune the model by calling the OpenAI API to make the AI model capable of answering Drupal questions.
    (OpenAI Fine-tune referance:https://platform.openai.com/docs/guides/fine-tuning)

    However, for the model fine-tune in the first step, since all of the work should be FOSS, I have a alternative.
    A solution that combines both generation and retrieval (what I call the Integration Method).
    Use Seq2Seq or HRED or GPT-2 or DialogGPT open-source AI mini-models(I said "mini" at here is because these models' parameter is much more smaller compared to GPT-3 which has 175 bilion parameters) as the basic model for fine-tuning, and encode the relevant documents prepared by pre-training and Q&A into vectors in the Elasticsearch database through SBERT for easy retrieval. When the user asks the bot a question, the results generated by the AI model and the results retrieved from the Elasticsearch database are compared, and a better result is selected as the final output and returned to the user.
    (If you are interested in the Integration Method, the details of this method can refer at here: https://9w.pw/IntegrationMethod )
    (Related project repo on Github: https://github.com/xyTom/MERIT)

    The advantage of using the Integration Method is that it does not rely on third-party services, the technology stack used is completely open source, and the final project results can be completely open source without relying on OpenAI.
    The disadvantage of course is that the results may not be as good as using OpenAI's GPT-3 model (after all, OpenAI have spent billions of dollars to train the model)

    In the second step, Vue.js is used to develop a user interface as the front-end that allows users to interact with the AI model, PHP (chosen because Drupal uses PHP, other programming languages are ok for me) is used as the back-end to process the user input returned from the front-end, and the fine-tuned model from the previous step is called through Open AI's API interface to obtain the AI response and return it to the user.

    Subsequent model iterations: After the project goes online, users will try to interact with the model, and we can use this interaction data to further fine-tune the original model using reinforcement learning and other methods to achieve better results. In this part, my preliminary plan is: after each user interaction with the robot, a scoring box will pop up in the front-end interface for the user to score the interaction, and we can select the lower scoring robot responses and retrain the AI model by manually writing new responses.

    This is just my initial idea, any comments or suggestions are welcome to me!

  • 🇺🇸United States x-Tom

    Hi @nick_vh, @yashn_99!

    I recently came across the project description for developing a chat-bot style interface for Drupal, and I am extremely interested in participating in this project. Given my background in natural language processing (NLP) and my prior experience working on similar projects, I believe my skills could be a significant asset to the team.

    Allow me to introduce myself: my name is Yuntao, and I am currently pursuing a Master's degree in Computer Engineering in the United States. I have been involved in NLP projects in various capacities. For example, I worked as a research assistant at Dartmouth College where I led the development of a novel HCI approach to improve typing efficiency and user experience. This project involved model training, dataset construction, and experiment design. I also independently researched and built a medical dialogue system which aimed to act as a doctor and interact with patients through natural language. I compared and analyzed different retrieval-based and generation-based approaches to achieve optimal performance.

    Considering my experience in NLP and dialogue systems, I am eager to contribute to the Drupal chatbot project. I am excited by the opportunity to create a new interaction model for learning Drupal, evaluate the potential of AI systems in open source initiatives.

    Specifically, for this project, my initial idea was to fine-tune the OpenAI GPT-3 model by pre-processing the Drupal FAQ, documents, etc. to conform to the format needed for training the model, and then fine-tune the model by calling the OpenAI API to make the AI model capable of answering Drupal questions.

    However, for the model fine-tune in the first step, since all of the work should be FOSS, I have a alternative.
    A solution that combines both generation and retrieval (what I call the Integration Method).
    Use Seq2Seq or HRED or GPT-2 or DialogGPT open-source AI mini-models(I said "mini" at here is because these models' parameter is much more smaller compared to GPT-3 which has 175 bilion parameters) as the basic model for fine-tuning, and encode the relevant documents prepared by pre-training and Q&A into vectors in the Elasticsearch database through SBERT for easy retrieval. When the user asks the bot a question, the results generated by the AI model and the results retrieved from the Elasticsearch database are compared, and a better result is selected as the final output and returned to the user.
    (If you are interested in the Integration Method, the details of this method can refer at here: 9w.pw/IntegrationMethod )

    The advantage of using the Integration Method is that it does not rely on third-party services, the technology stack used is completely open source, and the final project results can be completely open source without relying on OpenAI.
    The disadvantage of course is that the results may not be as good as using OpenAI's GPT-3 model (after all, OpenAI have spent billions of dollars to train the model)

    In the second step, Vue.js is used to develop a user interface as the front-end that allows users to interact with the AI model, PHP (chosen because Drupal uses PHP, other programming languages are ok for me) is used as the back-end to process the user input returned from the front-end, and the fine-tuned model from the previous step is called through Open AI's API interface to obtain the AI response and return it to the user.

    Subsequent model iterations: After the project goes online, users will try to interact with the model, and we can use this interaction data to further fine-tune the original model using reinforcement learning and other methods to achieve better results. In this part, my preliminary plan is: after each user interaction with the robot, a scoring box will pop up in the front-end interface for the user to score the interaction, and we can select the lower scoring robot responses and retrain the AI model by manually writing new responses.

    This is just my initial idea, any comments or suggestions are welcome to me!

  • 🇧🇪Belgium Nick_vh Ghent

    I'm removing myself as mentor but will leave it to the good hands of yashn_99!

  • 🇮🇳India giteshsarvaiya

    Hello @yashn-99,
    Is there anybody working on this ?

    Idea:
    We can use gpt-3.5 turbo ans fine tune it with multiple times with jaon based datasets datasets.
    (https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset)

    And then after incremental fine tuning, the bot will be ready and the bots backend will be ready.
    I am thinking of using nextjs(for frontend) and expressjs(for backend) to create the bot.

    What do you think?

  • Status changed to Fixed 5 months ago
  • 🇮🇳India Shriaas Pune

    Hi @giteshsarvaiya this project was completed during GSoC 2023, here is the submission link: https://medium.com/@royalpinto007/work-report-google-summer-of-code-2023-journey-with-drupal-146e91dd88fc.
    Marking this issue as fixed.

  • Status changed to Fixed 5 months ago
Production build 0.69.0 2024