SQL Server — Identifying Empty Tables

melbdataguy
1 min readSep 21, 2023

--

In the realm of data engineering, one critical task involves running data pipelines and ensuring their successful execution. A crucial aspect of this process is verifying the integrity of the data in the destination tables after a pipeline run. Often, data engineers need to confirm that tables have been populated as expected, and this typically involves checking the row count in each table. However, performing individual SELECT COUNT(*) queries for every table can be time-consuming.

Fortunately, there’s a more efficient solution: a simple SQL query that quickly identifies empty tables in SQL Server.

SELECT 
s.Name + '.' + t.name AS table_name,
(select sum(p2.rows)
from sys.indexes i2 inner join sys.partitions p2 ON i2.object_id = p2.OBJECT_ID AND i2.index_id = p2.index_id
where i2.object_id = t.object_id and i2.object_id > 255 and (i2.index_id = 0 or i2.index_id = 1)
) as total_rows
FROM
sys.tables t
INNER JOIN
sys.schemas s ON s.schema_id = t.schema_id
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
t.is_ms_shipped = 0 AND i.OBJECT_ID > 255
GROUP BY
t.object_id, t.Name, s.Name
ORDER BY
total_rows DESC

This query efficiently summarises the tables with no rows, simplifying the process of data pipeline verification in your data engineering workflows. Moreover, it can easily be adapted to check for empty tables within specific schemas by modifying the WHERE filter accordingly, allowing for greater flexibility in monitoring and managing your data pipelines.

--

--

melbdataguy

Just a curious data engineer based in Melbourne | Self-learner