Businesses need answers to questions that support or inform decisions. Those answers are best served by supporting data, and that data comes from queries against available data stores. The quicker we can supply those answers, the more effective managers can be in making those critical business decisions.
How do we improve the performance of the queries providing those answers? Let’s look at the process, briefly.
We collect more data today than ever before in history. We store that data in a variety of ways, from traditional relational databases to blockchain technologies. The one thing that’s consistent in improving query performance is the need to minimize the amount of data we need to read to return the results. The more data we have to read, the longer the query takes.
How do we minimize the amount of data we need to read? We can do that by organizing the data in ways that lend themselves to rapid response to the questions businesses need to answer. This is where things can get interesting. A good data architect is critical to the process of organizing the data structures so that the business questions can be most readily answered.
Do you need to be able to look at sales results by region by quarter? If so, you probably want a pre-aggregated dataset (a “cube”) so you can explore the various ways to view the data. Do you need to need to know where a particular package is in your delivery system right now? In this case you need a more direct approach, looking at the vehicle tracking data real-time.
In both of these cases, and in many more, you need to minimize the amount of data you need to sift through to get the results you’re looking for. If you can put the data in a form that minimizes the amount of data you need to read, your results will be returned much faster.
Now, to get more specific, you can define your data to maximize the number of rows that fit on a single data block or page, you can define covering indexes that contain the data attributes in your query, and you can define analytical structures that pre-aggregate the data most likely to be needed by the business, but the root solution is to minimize the amount of data your query needs to read to provide the results required.
When you start working on query tuning, look to minimize the data you need to read.