SQL Tells a Human Story
You can learn a lot about an organization by reading through it’s SQL files. These files contain the map of how an organization works. By reading through them, you can learn how systems come together to make a business run. You can learn what’s working, and what’s not.
If you’ve ever come across an overly complicated, too-long, SQL file, riddled with one-off conditions, it can feel tempting to burn it down and start over. But, please don’t. In that SQL lies an unfiltered picture of how your organization really functions, beyond clean architecture diagrams and documented operational processes.
As a detail oriented data nerd, I’ve been known to frequently explain “the devil is in the details” and that couldn’t be more true for our SQL files. Each line represents a piece of logic that tells us something about how our organization functions. In this article I’ll walk through some examples of common SQL patterns and what you can learn from them. I’ll then outline how you can conduct your own SQL forensics to systematically understand and address organizational patterns.
Common SQL Patterns and Human Stories
Record Cleanup
Record specific adjustments are a common SQL gotcha. The purpose of the query below is to simply calculate the sum of customer spend by geography and month. However, something appears to be wrong with a number of specific transactions, as they are omitted from the total calculation. The reason for this could be duplicate records, test transactions, non-billable sales, or any number of other edge cases. The main problem is that the data is incorrect in the system and now we are using SQL bandaids to clean it up on the reporting layer.
When you encounter statements like this, ask around about why this happened. What was the human exchange that made these lines of code come to be? How were these erroneous records discovered? Was there a reason that the system team did not clean the records at the source? Did the data team feel empowered to ask the system team to clean the records up? This is the kind of scenario you need to get to the bottom of to understand teaming, communication and operational dynamics that may need to change.
System Gaps
Another pattern that I’ve seen quite often is an ongoing use of IF or CASE statements to derive a critical business value based on domain knowledge. In the example below, we are querying the customer table to get the total spend by customer. Additionally, we want to understand if the accounts are internal or external. In this example internal accounts are owned by someone inside the company and used for business purposes. Their spend will likely be reimbursed via discounted departmental charge back. External accounts are owned by someone outside of the company who will pay for their services with real dollars. The code below shows that Acme corp does not have a way of systematically identifying internal accounts and therefore relies on a best effort mix of pattern matching and domain knowledge.
It’s important to take the time to understand why these case statements are here and what the broader implications are. This particular case statement means that Acme corp has no way to separately manage internal accounts. This is a business limitation that could have a broader impact on customer payments, onboarding, support and communications. Likely all of these business functions have encountered challenges with this limitation and are trying to create separate, hand crafted flows to account for the missing feature. This leads to inefficient and error prone implementations of a core concept.
On the human side, if there are too many system bandaids in a particular area we have to wonder if there is a communication gap between the business and system owners. We need to ask, why none of the dependent systems made this requirement known to the system teams. Or, was this gap known but not prioritized and if so why?
Code Ownership
Common Table Expressions (CTEs) are powerful because you can chain SQL statements together to build your logic in a comprehensive flow. Unfortunately, sometimes CTE’s are a little too handy and can result in very long SQL expressions with chained CTEs. If your company has a critical SQL file and they need to make an addition to the functionality, it can be easy to just add another CTE at the end of the query. The issue with this approach is that the SQL keeps growing and over time can cause performance and readability to suffer.
In the oversimplified example below we can see that the middle CTE “CUSTOMERS” is not even used in this SQL. Of course it’s unlikely this would happen in real life with just two CTEs, but in real life it’s not uncommon to see 10, 20, 30 or more CTE’s chained together before the final query. Over time, the individual CTE’s can become inefficient, unnecessary or redundant.
In these cases, we need to ask how our code is managed and reviewed. What process do we have for making changes to core SQL? Do we have anyone who understands the end to end of this SQL? Or has it been passed around, each custodian putting on a little bandaid and moving on to the next issue? It’s important to understand these type of issues as they reveal a lot a teams operational model and areas of risk.
Conduct your own SQL Forensics
If any of these scenarios sound familiar to you, it might be time to start conducting your own SQL forensics. While there is no perfect method to the task, I’ll share a method that I’ve seen to work in the past.
Step 1: Pick an important and problematic SQL file
Find a SQL file that you know to be overly complicated and in high demand. This should be a file of consequence to your organization so that you can be sure issues are not due to lack of importance or use.
Step 2: Walk through the SQL file and make note of anything the looks like it could be done a better way
This part takes time and patience. Grab your favorite drink or snack, put on some music and start reading! Anything that looks weird, ask yourself why this was done and if there could be a better way. Create a log of all oddities and make note of the following:
Code line number
Function purpose
Reason for review
General action that could be taken
Step 3: Review your notes to identify themes
Review your notes from step 2 to identify general themes of issue reasons and possible actions. Group the issues into larger action buckets like “Work with the system teams to fix”, “Fix the code” etc.
Step 4: Put issues into the appropriate backlogs
Create a plan for each action bucket. Work with your partners to get these issues prioritized and ticketed in the appropriate systems.
Step 5: Protect against these issues in the future
Talk with your team about the themes found and the impact that they had. Discuss how you can avoid issues like this in the future and identify some guiding principles that you will follow. For example, make a rule that no one-off record cleanup requests will be done without approval from management.
Step 6: Celebrate your hard work
Going through your SQL files in such detail is a lot of work. But no doubt, if you’ve finished this exercise, you’ve learned a lot about your organization and the systems and processes used within it. Take a last moment to celebrate your hard work.
Thank you
Thank you for reading through this article and thinking through what you could learn by understanding the human stories in your SQL. Please feel free to reach out with any questions!