SQL Query Optimization: Avoid Cartesian Join

Created on 12 July 2024, 5 months ago

Problem/Motivation

The unnecessary use of Cartesian joins leads to particularly slow queries with large populated tables, where each row of each table might be matched with every row of every other table during the execution the query. In particular, this might occur when a full scan of the join happens looking for something that _does not_ occur in the tables (looking for an entity type that is not tracked, for example).

Two known locations performing Cartesian joins are:

There may be others; however, in the case of these

Steps to reproduce

Perform the queries separately, and see that the set of columns returned contains the columns of all tables under the Cartesian join.

Proposed resolution

Rework process and/or queries to avoid Cartesian joins. Given the two queries in question are related exclusively to the existence of rows matching particular conditions, we could move to perform multiple individual queries, and return the disjunction of all of them. Alternatively, reworking the queries to do the separate queries UNION ALL'd together might do the trick in one.

Remaining tasks

- get code into MR
- review
- merge

User interface changes

None.

API changes

None.

Data model changes

None.

📌 Task
Status

Active

Version

2.0

Component

Code

Created by

🇨🇦Canada adam-vessey PE, Canada

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Production build 0.71.5 2024