In this blog post I will explain the differences between OLTP & OLAP and how should we configure SQL Server and its workload in both the cases.
OLTP (Online Transaction Processing)
OLTP is designed to serve as a persistent data store for business or front-end applications. OLTP administers day to day transaction of an organization. They excel at quickly looking up specific information as well as processing a large number of short on-line transactions (INSERT, UPDATE, DELETE). The main emphasis for OLTP systems is put on very fast query processing, maintaining data integrity in multi-access environments and an effectiveness measured by number of transactions per second. In OLTP database there is detailed and current data, and schema used to store transactional databases is the entity model (usually 3NF).
Examples of some common requirements need OLTP:
- What are the active orders of the current users?
- What is the name of the current user, when given an email address?
- What was the last stage that a player was on a mobile game?
- What is current billing addresses for a set of clients?
These type of requirements need a system that can look up and update one or more columns within one or many rows. The strength of OLTPs is that they support fast writes and fast data retrieval. A typical workload for OLTP is both frequent reads and writes, but the reads tend to be more of looking up a specific value rather than scanning all values to compute an aggregate.
Example of OLTP system
An example of OLTP system is ATM center. Assume that a couple has a joint account with a bank. One day both simultaneously reach different ATM centers at precisely the same time and want to withdraw total amount present in their bank account.
However, the person that completes authentication process first will be able to get money. In this case, OLTP system makes sure that withdrawn amount will be never more than the amount present in the bank. The key to note here is that OLTP systems are optimized for transactional superiority instead data analysis.
Other examples of OLTP system are:
- Online banking
- Online airline ticket booking
- Sending a text message
- Order entry
- Add a book to shopping cart
Benefits of OLTP method
- Provides ability to faster time to market
- Serves daily transactions of an organization.
- Widens the customer base of an organization by simplifying individual processes.
OLAP (Online Analytical Processing)
In contrast to an OLTP database, an OLAP database is designed to process large datasets to answer questions about data. OLAP is a combination of software tools which provide analysis of data for business decisions. OLAP systems allow users to analyze database information from multiple database systems at one time.
It is characterized by relatively low volume of transactions. Queries are often very complex and involve aggregations of large data sets. For OLAP systems a response time is an effectiveness measure. OLAP applications are widely used by Data Mining techniques. In OLAP database there is aggregated, historical data, stored in multi-dimensional schemas (usually star schema).
- Star schema: In computing, the Star Schema is the simplest style of data mart schema. The star schema consists of one or more fact tables referencing any number of dimension tables. The star schema is an important special case of the snowflake schema, and is more effective for handling simpler queries.
Examples of some common requirements need OLAP:
- What is the most selling product of website?
- What is the demand trend of a product?
- Setting up the optimal prices as per demand?
- What is the customer lifetime value of an e-commerce application?
- What is the median duration of play time on a mobile game?
- What is the conversion rate for various landing pages based on the referrer?
An OLAP database is optimized for scanning and performing computations across many rows of data for one or multiple columns. Since OLAP is optimized for analyzing data, basic transactional procedures like writes or updates tend to be done in infrequent batches, typically once a day or an hour. OLAP shines when it comes to reads and analytical calculations like aggregation.
Example of OLAP
Any Data warehouse system is an OLAP system. Uses of OLAP are as follows
- A company might compare their mobile phone sales in September with sales in October, then compare those results with another location which may be stored in a separate database.
- A retail company analyzes purchases by its customers to come up with a personalized homepage with products which likely interest to their customer.
- An airline may analyse their busiest routes during summers and winters and schedule their flights accordingly
- A restaurant may compare their sales during different seasons and can control their food supply as per demand.
Difference between OLTP and OLAP
Parameters | OLTP System | OLAP System |
Source of data | Operational data; OLTPs are the original source of the data. | Consolidation data; OLAP data comes from the various OLTP Databases |
Purpose of data | To control and run fundamental business tasks | To help with planning, problem solving, and decision support |
What the data | Reveals a snapshot of ongoing business processes | Multi-dimensional views of various kinds of business activities |
Inserts and Updates | Short and fast inserts and updates initiated by end users | Periodic long running batch jobs refresh the data |
Queries | Relatively standardized and simple queries Returning relatively few records | Often complex queries involving aggregations |
Processing Speed | Typically very fast | Depends on the amount of data involved; batch data refreshes and complex queries may take many hours; query speed can be improved by creating indexes |
Space Requirements | Can be relatively small if historical data is archived | Larger due to the existence of aggregation structures and history data; requires more indexes than OLTP |
Database Design | Highly normalized with many tables | Typically de-normalized with fewer tables; use of star and/or snowflake schemas |
Backup and Recovery | Backup religiously; operational data is critical to run the business, data loss is likely to entail significant monetary loss and legal liability | Although regular backups are always a safer option, but some environments may consider simply reloading the OLTP data as a recovery method |
OLTP is a data modeling approach typically used to facilitate and manage usual business applications. Most of applications you see and use are OLTP based.
OLAP is an approach to answer multi-dimensional queries. OLAP was conceived for Management Information Systems and Decision Support Systems.
With the constant growth of data analysis and business intelligence applications (now even in small business) understanding OLAP nuances and benefits is a must if you want provide valid and useful analytics to management.
The following table will summarize main technical differences between OLTP and OLAP:
Parameters | OLTP System | OLAP System |
Application | Operational: ERP, CRM, legacy apps, … | Management Information System, Decision Support System |
Typical users | Staff/Customers | Analyst/Business users |
Horizon | Weeks, Months | Years |
Refresh | Immediate | Periodic |
Data model | Entity-relationship | Multi-dimensional |
Schema | Normalized | Star |
Emphasis | Update/retrieval | Retrieval |
Workload distribution on OLTP and OLAP
SQL Server Data Warehouse (DW) workloads differ in significant ways from traditional Online-Transaction-Processing (OLTP) workloads in how they should be performance tuned because of the different query patterns inherent in both designs. Below is a table which summarizes some of the most important differences.
An OLTP workload is characterized by seek centric operations that are best measured by IOPS (In-and-Out-Operations-Per-Second). The typical OLTP query only requires a few rows from several tables (3rd normal form) to be returned ideally from memory and rarely from disk. An OLTP system’s performance is determined by how quickly it can seek those few rows and return them back to the application. However, a typical data warehouse query will return a lot of historical data from disk to be aggregated on the CPU. For example to get the Total Sales Amount for an entire year all of the sales transactions for that year will have to move from disk onto the CPU to be aggregated to the yearly level. This large data movement requires scan centric operations.
End users and applications update the OLTP system through DML (Data Manipulation Language: updates, inserts, and deletes) operations continuously throughout the day so that data is described as being volatile while a data warehouse typical only preforms the bulk INSERTS (or potentially UPDATES and DELETES) operations during a daily ETL process.
In order for an OLTP system to quickly locate the rows that it requires many indexes will need to be created, however data warehouse tables do not require nearly as many indexes. In the example of yearly Total Sales Amount an index will help located all the transactions for the entire year but will not be able to improve the time that it takes to move all of those transaction from disk to the CPU. Therefore, only relatively few indexes are required.
OLTP systems are also characterized by high concurrency because many users or applications can be querying the system at the same time. Data warehouses, however, normally only support a few users who are typically characterized as analysts.
Fine tune your workload on OLTP and OLAP
Following are the best practices while configuring your SQL Server on an OLTP and OLAP:
OLTP System | OLAP System | |
SQL Server Indexes | Try to cover queries with supporting indexes to run faster. | Try to cover queries minimally as it will impact bulk loading of data significantly. In most of the cases analysis processes read full table. |
Parallelism in SQL Server Plan | Try to restrict the parallel queries and ensure non-parallel SQL plan is being used | A parallel plan is preferred |
Preferred Physical Join | Nested Loop | Hash Join |
Maximum Degree of Parallelism | As small as it can be- as OLTP serves high concurrent transactions, we need to make sure CPU is available for other concurrent transactions so one process cannot take the whole CPU. | As high as it can be- as OLAP does not serve high concurrent workload so we should utilize as many CPU as we can during one bulk transaction. |
SQL Transactions | Keep it small as it can be. | Keep it big as it can be. |
Primary Key | It should be created on business unique column so that joins and data filter can extract data faster. | Surrogate keys are database keys used to relate the dimension tables to the fact tables. Surrogate keys (SK) have no meaning to the business and no intrinsic meaning. |
Note: These are the best practices and can vary environment to environment so please configure your SQL Sever as per your requirements.
Please leave your feedback in comment section.
Cheers.
excellent article…good tips for deciding index strategy as per the workload
LikeLike
Very well explained! Keep sharing 👍
LikeLike
Nicely explained. Best part is sql server configuration customisation
LikeLike
Very well explained… very informative.. 👍
LikeLike
Nicely explained with key differences. Great work.
Thank you
LikeLike
Thanks Chandan for your feedback.
LikeLike
There is still a lot of misconception about OLAP and OLTP workload so I have tried to explain both the type of workloads as per my professional experience. I am glad you found it worth reading.
LikeLike
Great tutorial! Thanks for that!
LikeLike
GOOD POST
LikeLike
A very good article. Clearly explains the concepts.
LikeLike