Batch Your DB Queries

Nawab Iqbal
2 min readJan 14, 2020

--

While working in a high level language, we get so carried away that we start to query our RDBMS in object abstractions. It works for the most part in case of simpler objects and that is where ORM has a place. However, as you start querying some complicated objects which have relationships to other objects like a tree, the query which should have retrieved an employee’s last-name, ends up querying his address, tasks, and so on. Similarly, with lists of objects, querying one by one becomes very expensive for very small count.

Recently, I was refactoring some code when I discovered multiple examples, which could be replaced with batch queries. Here is the chart when I replaced the individual queries with one batch query to the db. In some cases, one batch query replaced 100s of individual queries.

Batched vs Individual Object Retrieval. Even for 6 objects, batched query is 3 times faster than non-batched.

It didn’t affect the performance whether I wrote the batch query in Hibernate or Native SQL, the performance benefit was same. The only drawbacks in querying a list are

  1. More complicated parsing of the result set, especially if a 1:N join is involved.
  2. In some cases, you may need to define new data holder objects which only contain the data required for the specific scenario, since this data doesn’t naturally fit the existing Object model.

Here are my queries: The only difference is ‘=’ vs ‘IN’ clause.

Querying One Object (HQL)

SELECT bc.id FROM BusinessCategory bc JOIN BusinessTag bt on bt.businessCategory.id = bc.id JOIN BusinessTagLink btl on btl.businessTag.id = bt.id WHERE btl.business.id = :id ORDER BY btl.priority DESC

Query List of Objects (HQL)

SELECT btl.business.id, bc.id FROM BusinessCategory bc JOIN BusinessTag bt on bt.businessCategory.id = bc.id JOIN BusinessTagLink btl on btl.businessTag.id = bt.id WHERE btl.business.id in (:id) ORDER BY btl.business.id, btl.priority DESC

Both HQL and SQL (not shown here) performed similarly when querying in batch. In such cases, we don’t need to do a lot to increase the performance; the opportunity is cheap and in plain sight.

--

--

No responses yet