The current version of the module (7.x-3.3) is broken, and the latest dev also has problems**
Moreover, while the general idea of the module is very good, it has a fundamental limitation: the fetched data are always based on a single reporting period, which is set on the admin page.
But there might be cases where different datasets are required.
For example, if a site needs to have the most read:
- Articles of the last week
- Blog posts of the last month
those 2 views cannot be implemented with this module's data.
I propose a new version of the module that provides extended functionality and also fixes the problems of the current version.
The new version will be based on the custom metrics/dimensions that Google Analytics provides. Instead of fetching a big bunch of GA results and trying to map them to Drupal nodes, the google_analytics module will need to send the nid on every pageload event, so that google_analytics_counter can retrieve data that already carry the nid.
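To illustrate why having the nid in the returned data simplifies things, here is a minimal sketch in Python (used only for illustration, not Drupal code). It assumes each returned row has the shape (nid, entity_type, bundle, pageviews); that row shape and the function name pageviews_by_nid are assumptions made for this example, not the actual GA response format.

```python
from collections import defaultdict


def pageviews_by_nid(rows):
    """Sum pageviews per node ID.

    Each row is assumed to look like (nid, entity_type, bundle, pageviews),
    with every value delivered as a string (hypothetical shape).
    """
    totals = defaultdict(int)
    for nid, entity_type, bundle, pageviews in rows:
        # Skip rows that are not nodes or that carry no usable nid.
        if entity_type != "node" or not str(nid).isdigit():
            continue
        totals[int(nid)] += int(pageviews)
    return dict(totals)


rows = [
    ("12", "node", "article", "40"),
    ("12", "node", "article", "2"),
    ("7", "node", "blog", "15"),
    ("(not set)", "node", "article", "3"),
]
print(pageviews_by_nid(rows))  # {12: 42, 7: 15}
```

No path-to-node lookup is needed here: the nid arrives with the row, so the only work left is summing and discarding rows without a valid nid.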
A first draft of the specs for the new version follows:
1) The module has a dependency on the google_analytics module.
2) Part of that dependency is to make sure that on every node pageload, the GAL event sends the (n)id, entity_type (node) and the bundle. The ga module already provides this functionality, but gac needs to make sure that ga is actually configured to do so.
3) The module defines different datasets. Each one is translated to a GAL query with:
- optional start date (fixed or relative)
- optional end date (fixed or relative)
- optional limit of data that will be fetched
- optional array of bundle names (node types)
- timestamp of the last successful data fetch
4) The datasets are stored as entities with the following properties:
id, title, timestamp, query
5) Ctools plugins are used as wrappers to handle the GAL queries (translating a GAL query to a human-readable form and vice versa)
6) The pageviews are stored on their own table with the following columns:
entity_type (= node by default), (n)id, dataset-id, pageviews
7) Each cron run fetches data for one dataset, which is selected round-robin (based on the stored timestamp). It then divides the data into N chunks of nodes (N is configured on an admin page) and creates N+1 queue items. The +1 (first) queue job cleans up pageview rows that were not part of the returned data, and the remaining N jobs update or create the relevant pageview counts. Thus, fetching the data and processing/storing them are done asynchronously.
8) Views integration
9) The admin page has configuration about:
- the number of nodes that are processed per cron run
- the caching of the data (how often they get fetched), chosen from a fixed list of values
10) Future development could extend the module so that it also counts pageviews of other entities (like taxonomy terms, or generic entities)
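The cron step in item 7 could be sketched roughly as follows. This is Python used only for illustration; it is not Drupal's queue API, and the names Dataset, pick_next_dataset and build_queue_items are hypothetical. It assumes the fetched data have already been reduced to a nid => pageviews mapping.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    id: int
    title: str
    last_fetched: int = 0  # Unix timestamp of the last successful fetch; 0 = never


def pick_next_dataset(datasets):
    """Round-robin: pick the dataset whose last successful fetch is oldest."""
    return min(datasets, key=lambda d: (d.last_fetched, d.id))


def build_queue_items(dataset, counts, n_chunks):
    """Return N+1 queue items: one cleanup job first, then N update jobs.

    counts maps nid -> pageviews for the fetched dataset.
    """
    nids = sorted(counts)
    # The first job removes stored rows whose nid is not in the fetched data.
    items = [{"op": "cleanup", "dataset_id": dataset.id, "keep_nids": nids}]
    size = max(1, -(-len(nids) // n_chunks))  # ceiling division
    for i in range(n_chunks):
        chunk = nids[i * size:(i + 1) * size]
        items.append({
            "op": "update",
            "dataset_id": dataset.id,
            "pageviews": {nid: counts[nid] for nid in chunk},
        })
    return items


datasets = [Dataset(1, "Articles, last week", 200), Dataset(2, "Blog posts, last month", 100)]
current = pick_next_dataset(datasets)
items = build_queue_items(current, {1: 10, 2: 5, 3: 7, 4: 1, 5: 9}, n_chunks=2)
print(current.id, len(items))  # 2 3
```

In this sketch a chunk may be empty when there are fewer nodes than chunks. A real implementation could skip empty chunks instead of queueing them.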
A big part of the code will be directly used from the current version.
Also, as part of the rewrite, we could implement it in a way that makes the D8 port as easy as possible.
I'm posting this here as a first teaser. Please feel free to share any of your ideas/comments/worries.
** References:
https://www.drupal.org/node/2794797
https://www.drupal.org/node/2756313
https://www.drupal.org/node/2788335