SQLAlchemy: Hierarchical Tables and Handling Large ResultSets

Yixuan Zhou
2 min readSep 11, 2020

There are tables that join with themselves, we call these tables self-referential or hierarchical tables.

  • contain a relationship with themselves
  • commonly found in organizational, geographic, network and graph

For example, manager is also an employee. The table has an undefined relationship between the id column and the manager column. In order to use this relationship in a query, we need a way to refer to this table by another name.

The alias() method allows us to do that by creating a way to refer to the same table with two unique names.

managers = employees.alias()

stmt = select([managers.columns.name.label(‘manager’), employees.columns.name.label(‘employee’)])

stmt = stmt.select_from(employees.join(managers, managers.columns.id == employees.columns.manager))

stmt = stmt.order_by(managers.columns.name)

print(connection.execute(stmt).fetchall())

We should have the table in the group_by() and the alias in the function or vice versa.

managers = employees.alias()

stmt = select([managers.columns.name, func.sum(employees.columns.sal)])

stmt = stmt.select_from(employee.join(managers, managers.columns.id == employees.columns.manager))

stmt = stmt.group_by(managers.columns.name)

print(connection.execute(stmt).fetchall())

Dealing with large ResultSets

  • fetchmany() lets us specify how many rows we want to act upon
  • we can loop over fetchmany()
  • It returns an empty list when there are no more records
  • We have to close the ResultProxy afterwards

#Expression is true and if so, it will continue running a loop

# Print the count by state
print(state_count)

→ {‘Illinois’: 172, ‘New Jersey’: 172, ‘District of Columbia’: 172, ‘North Dakota’: 75, ‘Florida’: 172, ‘Maryland’: 49, ‘Idaho’: 172, ‘Massachusetts’: 16}

--

--