[META] Proposal: Track ##: Telemetry

Created on 7 November 2024, 5 months ago

Summary

Telemetry is a crucial part of modern software development: it provides information about how real-world users interact with a software application. Drupal has not integrated a formal telemetry system in the past, but Drupal CMS is a great opportunity both to try one and to use the insight it provides to rapidly improve the product.

We are used to industry standards for anonymized telemetry collection in most of the software we use today, whether open source or proprietary - including web browsers like Mozilla Firefox, our phone and desktop operating systems, and many apps and desktop software packages.

Work to be done (in scope)

In this track, we propose to:

  • Collaborate with other Starshot core and track leads about the insights that would be most valuable.
  • Identify what features should be instrumented
  • Identify additional features we would like to have, like:
    • net-promoter-score collection
    • user surveys
    • optional, opt-in data gathering that is not anonymized, such as:
      • website domain
      • contact email for a site administrator.
      • And perhaps more
  • Establish privacy and data management standards for use of telemetry data - in partnership with the Privacy Track team, especially.
  • Identify how telemetry can help support proposals by other tracks to offer integrations with third party providers (AI, SEO, Analytics, etc) - and how that might impact partnership and revenue opportunities to support the Drupal Association.
  • Identify both commercial and open source telemetry providers.
  • Design the opt-out process for collecting anonymized telemetry, following the standards of other common software packages and open source projects (such as Mozilla Firefox).

Out of scope for official launch

  • Inclusion of Telemetry collection in Drupal core
  • Decision on how much aggregate/anonymized data can be made public, vs reserved for project leadership. (Default to kept private to project leadership).

Target milestone

Ideally the first implementation of Telemetry is available in the alpha/beta/RC phase of Starshot 1.0 - so that the initial insights can be acted upon for the 1.0 release.

Having the Telemetry system integrated with the 1.0 release would allow rapid iteration in the next 1.* versions.

However, this is not a blocking priority for release, so it could be a fast follow in a 1.* version.

Skills required

  • Expertise in Telemetry collection systems
  • Expertise in telemetry analysis to drive feature development
  • User testing design (to the extent that a telemetry tool can be configured to do distributed user testing experiments)
  • Drupal integration (with the third party telemetry suite)
  • Drupal.org integration (if needed, depending on the chosen telemetry suite)

Blockers/Dependencies

The solution must:

  • Have a clearly defined privacy and data protection standard.
  • Comply with all applicable international regulations on data collection
  • Only be opt-out for aggregate/anonymized data
  • Only be opt-in for optional, identifiable data such as domain name (not PII) or site admin email address (PII)
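
The two consent rules above can be sketched as a small filter. This is a minimal illustration in Python, not a proposed implementation: the names `ConsentSettings`, `ANONYMIZED_FIELDS`, `IDENTIFIABLE_FIELDS`, and `build_payload` are hypothetical, and the field lists are placeholders.

```python
from dataclasses import dataclass

# Hypothetical field names, chosen only for illustration.
ANONYMIZED_FIELDS = {"drupal_version", "recipes_applied"}
IDENTIFIABLE_FIELDS = {"domain", "admin_email"}


@dataclass
class ConsentSettings:
    # Anonymized collection is opt-out: on unless the admin disables it.
    anonymized: bool = True
    # Identifiable collection is opt-in: off unless the admin enables it.
    identifiable: bool = False


def build_payload(data: dict, consent: ConsentSettings) -> dict:
    """Return only the fields that the site's consent settings allow."""
    allowed = set()
    if consent.anonymized:
        allowed |= ANONYMIZED_FIELDS
    if consent.identifiable:
        allowed |= IDENTIFIABLE_FIELDS
    return {key: value for key, value in data.items() if key in allowed}
```

With the default settings, only the anonymized fields pass through. Treating the two switches as independent is an assumption of this sketch; the privacy policy could just as well require that opting out of anonymized collection disables everything.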

The telemetry software suite must:

  • Be able to instrument all of the kinds of features, user testing, NPS, surveying, etc. that we require.
  • Be able to quickly deploy tracking adjustments so that the focus of the data collection can be changed as Drupal CMS development focus changes
  • Be affordable (or free) for the Drupal Association to maintain on behalf of the community

Our preference for a telemetry suite is:

  • A third party service provider willing to partner with us, who can manage and safeguard the telemetry data, provide access to the data on a principle-of-least-privilege basis, and handle the complex issue of international data privacy compliance.
  • A self-hosted/open source telemetry solution is still on the table, but may require more time, and expertise not currently on the internal DA team.

Track lead

TBD - currently being led by a taskforce of:
@dries, @hestenet, @lenny-moskalyk, @pixelite, @laurii, @lewisnyman
With intent to recruit to the team:

  • A PM to take the 'lead' role
  • A lead engineer
  • A user experience reviewer
  • A privacy expert (from the Privacy track)
  • A user testing expert (perhaps the same person)
  • Marketing experts (from DA/community)

Proposal

The Telemetry solution should be integrated and enabled during the installer - so that it can instrument and report on the user experience from the very beginning of installing Drupal CMS.

One of the steps of the installer should look like one of these established patterns:

Firefox

Mac OS

From there, the telemetry should instrument the following features (this is not a complete or final list):

  • What buttons/links etc the user is clicking on
  • What admin paths the user has navigated to
  • Which forms have been started/completed
  • Indicators of confusion/difficulty in navigation, such as:
    • Multiple rage clicks
    • Multiple clicks and back button uses
    • Etc
  • What recipes are applied
  • What modules, themes, recipes, etc are chosen in the Project Browser
  • What configuration changes are made
  • Basic environment information:
    • Drupal version
    • Hosting env info
  • Some sub-set of error messages/warnings/watchdog messages
  • Session replay capability - to be able to view the actual use of a page over a period of time (effectively, distributed, anonymized user testing).
  • Perhaps more
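
One of the confusion indicators listed above, rage clicks, can be approximated as several clicks on the same element within a short time window. The sketch below is illustrative only: the function name `detect_rage_clicks`, the thresholds (3 clicks within 1 second), and the `(timestamp, element_id)` event shape are assumptions, not part of this proposal or of any particular telemetry suite.

```python
from collections import defaultdict


def detect_rage_clicks(clicks, min_clicks=3, window_seconds=1.0):
    """Return the IDs of elements clicked at least `min_clicks` times
    within any span of `window_seconds`.

    `clicks` is an iterable of (timestamp_in_seconds, element_id) pairs.
    """
    # Group click timestamps by the element that was clicked.
    times_by_element = defaultdict(list)
    for timestamp, element_id in clicks:
        times_by_element[element_id].append(timestamp)

    flagged = set()
    for element_id, times in times_by_element.items():
        times.sort()
        # Check each run of `min_clicks` consecutive clicks on this element.
        for i in range(len(times) - min_clicks + 1):
            if times[i + min_clicks - 1] - times[i] <= window_seconds:
                flagged.add(element_id)
                break
    return flagged
```

Three quick clicks on one button would be flagged, while the same number of clicks spread over several seconds would not. A real telemetry suite would likely provide its own detection, so thresholds like these would be configuration rather than code.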

Optional, opt-in only collection should include:

  • Domain name
  • Opt in to product news
  • Provide name, email address for contact
  • Perhaps more

Roadmap

  1. Validate this track proposal with the initial track team, including representatives from other tracks we want to consult.
  2. Determine our privacy and data protection policy
  3. Select a vendor or open source telemetry suite
    • This includes evaluating privacy protection, features, cost, etc.
  4. Design the opt-out/opt-in UI on install, and the admin area for post-install changes.
  5. Determine if and how we need to use middleware through D.O to manage/update config.
  6. Integrate the telemetry suite with Drupal CMS in a feature branch.
  7. Experiment with instrumenting features.
  8. Determine an initial instrumented test path for the first round of data gathering.
  9. Merge into the main branch in time for alpha/beta/RC.
  10. Set a cycle for updating the instrumented experiments, summarizing the data, and informing the future roadmap
🌱 Plan
Status

Active

Component

General

Created by

🇺🇸 United States hestenet Portland, OR 🇺🇸


Comments & Activities

  • Issue created by @hestenet
  • 🇺🇸 United States hestenet Portland, OR 🇺🇸
  • 🇬🇧 United Kingdom catch

    From the issue summary, this could be proposing one of two things which are very different, and it's not clear to me which one it is:

    1. Telemetry of Drupal CMS installations and admins themselves - e.g. which recipes get installed by project browser or similar things that would be an extension to what update status already does. Or possibly exposing some of the existing update status data that is not available on Drupal.org (like which core modules are installed) via a page on d.o

    2. Telemetry of the visitors to Drupal CMS sites which to me has extremely grave privacy/GDPR implications since the site owner would need to somehow indicate they are sharing their own users' information with a third party (the Drupal Association).

    I think I saw some kind of draft somewhere about this, and that also was not clear whether it was talking about #1 or #2 but seemed to suggest it could be both.

    Obviously, Firefox or Mac OS telemetry is only able to do #1 since those are assumed to be single-user installations; this is not the case for a website.

  • 🇺🇸 United States hestenet Portland, OR 🇺🇸

    My intent was for it to be scoped to #1 in this summary

  • 🇫🇷 France andypost

    I think there's a mix of terminology: instrumentation is a way to use "open telemetry hooks" for function calls, but what I see in the summary looks more like metrics, à la the collectors in the contrib modules webprofiler or o11y.

  • 🇬🇧 United Kingdom catch

    @hestenet I'm not sure how these can be reliably limited to site owners. If that's the intention, they should be descoped.

    What admin paths the user has navigated to
    Which forms have been started/completed
    Indicators of confusion/difficulty in navigation, such as:
    Multiple rage clicks
    Multiple clicks and back button uses
    Etc

    ...

    Session replay capability - to be able to view the actual use of a page over a period of time (effectively, distributed, anonymized user testing).

    On a more basic level, because there's no drupal_cms module or install profile, it's not clear to me that usage of drupal_cms will be reported to update status/d.o at all - that might be a more urgent thing to figure out.

  • 🇦🇲 Armenia murz Yerevan, Armenia

    Actually, OpenTelemetry can submit not only traces but also metrics and logs. In the recent release of the Drupal OpenTelemetry module, I added a submodule, "opentelemetry_metrics", that can submit metrics. So maybe we can reuse this module to get the needed telemetry data?

    To avoid adding telemetry-specific code to Drupal core, we can apply auto-instrumentation (https://github.com/open-telemetry/opentelemetry-php-instrumentation) and hook the needed functions in a wrapper. However, integrating auto-instrumentation into the Drupal OpenTelemetry contrib module is still ongoing; see ✨ Support auto-instrumentation (Active).

  • 🇦🇺 Australia pameeela