Beginning with jOOQ 3.11, sort protected implicit JOIN
have been made out there, and so they’ve been enhanced to be supported additionally in DML statements in jOOQ 3.17. As we speak, I’d wish to deal with a considerably bizarre however actually highly effective use-case for implicit JOIN
, when becoming a member of further tables from inside an specific JOIN
‘s ON
clause.
The use case
The jOOQ code generator makes heavy use of jOOQ when querying the varied dictionary views. In PostgreSQL, most queries go to the SQL normal information_schema
, however now and again, the usual meta knowledge is inadequate, and we even have to question the pg_catalog
, which is extra full but additionally rather more technical.
For lots of information_schema
views, there exists an virtually equal pg_catalog
desk which accommodates the identical data. For instance:
information_schema |
pg_catalog |
---|---|
schemata |
pg_namespace |
tables or user_defined_types |
pg_class |
columns or attributes |
pg_attribute |
Apparently, PostgreSQL being an ORDBMS, tables and person outlined sorts are the identical factor and infrequently interchangeable within the sort system, however that’s a subject for a future weblog put up.
The purpose of this weblog put up is that always, when querying a view like information_schema.attributes
, we even have to question pg_catalog.pg_attribute
to get further knowledge. For instance, with the intention to discover the declared array dimension of a UDT (Person Outlined Sort) attribute, we have now to entry pg_catalog.pg_attribute.attndims
, as this data is nowhere to be discovered within the information_schema
. See additionally jOOQ function request #252, the place we’ll add assist for H2 / PostgreSQL multi dimensional arrays.
So, we would have a UDT like this:
CREATE TYPE u_multidim_a AS (
i integer[][],
n numeric(10, 5)[][][],
v varchar(10)[][][][]
);
The canonical SQL strategy to entry the pg_attribute
desk from the attributes
view is:
SELECT
is_a.udt_schema,
is_a.udt_name,
is_a.attribute_name,
pg_a.attndims
FROM information_schema.attributes AS is_a
JOIN pg_attribute AS pg_a
ON is_a.attribute_name = pg_a.attname
JOIN pg_class AS pg_c
ON is_a.udt_name = pg_c.relname
AND pg_a.attrelid = pg_c.oid
JOIN pg_namespace AS pg_n
ON is_a.udt_schema = pg_n.nspname
AND pg_c.relnamespace = pg_n.oid
WHERE is_a.data_type="ARRAY"
ORDER BY
is_a.udt_schema,
is_a.udt_name,
is_a.attribute_name,
is_a.ordinal_position
To visualise:
+----- udt_schema = nspname ------> pg_namespace | ^ | | | oid | = | relnamespace | | | v +------- udt_name = relname ------> pg_class | ^ | | | oid | = | attrelid | | | v is.attributes <-+- attribute_name = attname ------> pg_attribute
And now, we will see a couple of of our integration take a look at person outlined sorts, containing multi dimensional arrays:
|udt_schema|udt_name |attribute_name|attndims| |----------|------------|--------------|--------| |public |u_multidim_a|i |2 | |public |u_multidim_a|n |3 | |public |u_multidim_a|v |4 | |public |u_multidim_b|a1 |1 | |public |u_multidim_b|a2 |2 | |public |u_multidim_b|a3 |3 | |public |u_multidim_c|b |2 |
However take a look at all these JOIN
expressions. They’re positively no enjoyable. We have now to spell out all the path from pg_attribute
to pg_namespace
, solely to ensure we’re not fetching any ambiguously named knowledge from different UDTs or different schemata.
Utilizing implicit joins as an alternative
And that’s the place the facility of implicit JOIN
are available in play. What we actually need to write down in SQL is that this:
SELECT
is_a.udt_schema,
is_a.udt_name,
is_a.attribute_name,
pg_a.attndims
-- This desk we want
FROM information_schema.attributes AS is_a
-- And in addition this one
JOIN pg_attribute AS pg_a
ON is_a.attribute_name = pg_a.attname
-- However the path joins from pg_attribute to pg_namespace ought to
-- be implicit
AND pg_a.pg_class.relname = is_a.udt_name
AND pg_a.pg_class.pg_namespace.nspname = is_a.udt_schema
WHERE is_a.data_type="ARRAY"
ORDER BY
is_a.udt_schema,
is_a.udt_name,
is_a.attribute_name,
is_a.ordinal_position
It’s not that a lot shorter, nevertheless it’s positively very handy to now not have to consider how one can be a part of the completely different steps. Notice that in contrast to different instances, the place we used implicit joins by way of these paths in SELECT
or WHERE
, this time we’re utilizing them from inside a JOIN .. ON
clause! In jOOQ, we will write:
Attributes isA = ATTRIBUTES.as("is_a");
PgAttribute pgA = PgAttribute.as("pg_a");
ctx.choose(
isA.UDT_SCHEMA,
isA.UDT_NAME,
isA.ATTRIBUTE_NAME,
pgA.ATTNDIMS)
.from(isA)
.be a part of(pgA)
.on(isA.ATTRIBUTE_NAME.eq(pgA.ATTNAME))
.and(isA.UDT_NAME.eq(pgA.pgClass().RELNAME))
.and(isA.UDT_SCHEMA.eq(pgA.pgClass().pgNamespace().NSPNAME))
.the place(isA.DATA_TYPE.eq("ARRAY"))
.orderBy(
isA.UDT_SCHEMA,
isA.UDT_NAME,
isA.ATTRIBUTE_NAME,
isA.ORDINAL_POSITION)
.fetch();
The generated SQL appears barely completely different from the unique one, as jOOQ’s implicit JOIN
algorithm won’t ever flatten the JOIN
tree with the intention to protect any potential JOIN
operator priority, which is vital within the occasion of there being LEFT JOIN
, FULL JOIN
or different operators current. The output appears extra like this:
FROM information_schema.attributes AS is_a
JOIN (
pg_catalog.pg_attribute AS pg_a
JOIN (
pg_catalog.pg_class AS alias_70236485
JOIN pg_catalog.pg_namespace AS alias_96617829
ON alias_70236485.relnamespace = alias_96617829.oid
)
ON pg_a.attrelid = alias_70236485.oid
)
ON (
is_a.attribute_name = pg_a.attname
AND is_a.udt_name = alias_70236485.relname
AND is_a.udt_schema = alias_96617829.nspname
)
As you possibly can see, the “readable” desk aliases (is_a
and pg_a
) are the user-provided ones, whereas the “unreadable,” system generated ones (alias_70236485
and alias_96617829
) are those originating from the implicit JOIN
. And, once more, it’s vital that these implicit joins are embedded proper the place they belong, with the trail root pg_a
, from which we began the trail expressions. That’s the one method we will retain the right JOIN
operator priority semantics, e.g. if we had used a LEFT JOIN
between is_a
and pg_a
Future enhancements
Sooner or later, there is likely to be even higher JOIN
paths that permit for connecting such graphs straight, as a result of each time you must be a part of information_schema.attributes
and pg_catalog.pg_attribute
, you’ll must repeat the identical equalities on the (udt_schema, udt_name, attribute_name)
tuple, and whereas implicit JOIN
have been useful, it’s simple to see how this may be additional improved. The perfect question can be:
SELECT
is_a.udt_schema,
is_a.udt_name,
is_a.attribute_name,
pg_a.attndims
FROM information_schema.attributes AS is_a
-- Magic right here
MAGIC JOIN pg_attribute AS pg_a
ON jooq_do_your_thing
WHERE is_a.data_type="ARRAY"
ORDER BY
is_a.udt_schema,
is_a.udt_name,
is_a.attribute_name,
is_a.ordinal_position
However we’re not fairly there but.
Having access to these be a part of paths
Neither the information_schema
views, nor the pg_catalog
tables expose any overseas key meta knowledge, that are a prerequisite for implicit be a part of path expressions and different jOOQ code technology options. This isn’t an enormous downside as you possibly can specify artificial overseas keys to the code generator, for exactly this cause. See additionally our earlier weblog put up about artificial overseas keys for data schema queries. On this case, all we want is a minimum of this specification:
<configuration>
<generator>
<database>
<syntheticObjects>
<foreignKeys>
<foreignKey>
<tables>pg_attribute</tables>
<fields><discipline>attrelid</discipline></fields>
<referencedTable>pg_class</referencedTable>
</foreignKey>
<foreignKey>
<tables>pg_class</tables>
<fields><discipline>relnamespace</discipline></fields>
<referencedTable>pg_namespace</referencedTable>
</foreignKey>
</foreignKeys>
</syntheticObjects>
</database>
</generator>
</configuration>
And ta-dah, we have now our JOIN
paths as seen within the earlier examples.