Monday, February 21, 2011

Overview of Analytic SQL

Oracle has introduced many SQL operations for performing analytic operations in the database. These operations include ranking, moving averages, cumulative sums, ratio-to-reports, and period-over-period comparisons. Although some of these calculations were previously possible using SQL, the new syntax offers much better performance.
This section discusses:

SQL for Aggregation

Aggregation is a fundamental part of data warehousing. To improve aggregation performance in your warehouse, Oracle provides extensions to the GROUP BY clause to make querying and reporting easier and faster. Some of these extensions enable you to:
  • Aggregate at increasing levels of aggregation, from the most detailed up to a grand total
  • Calculate all possible combinations of aggregations with a single statement
  • Generate the information needed in cross-tabulation reports with a single query
These extension let you specify exactly the groupings of interest in the GROUP BY clause. This allows efficient analysis across multiple dimensions without performing a CUBE operation. Computing a full cube creates a heavy processing load, so replacing cubes with grouping sets can significantly increase performance. CUBE, ROLLUP, and grouping sets produce a single result set that is equivalent to a UNION ALL of differently grouped rows.
To enhance performance, these extensions can be parallelized: multiple processes can simultaneously run all of these statements. These capabilities make aggregate calculations more efficient, thereby enhancing database performance, and scalability.
One of the key concepts in decision support systems is multidimensional analysis: examining the enterprise from all necessary combinations of dimensions. We use the term dimension to mean any category used in specifying questions. Among the most commonly specified dimensions are time, geography, product, department, and distribution channel, but the potential dimensions are as endless as the varieties of enterprise activity. The events or entities associated with a particular set of dimension values are usually referred to as facts. The facts might be sales in units or local currency, profits, customer counts, production volumes, or anything else worth tracking.
Here are some examples of multidimensional requests:
  • Show total sales across all products at increasing aggregation levels for a geography dimension, from state to country to region, for 1999 and 2000.
  • Create a cross-tabular analysis of our operations showing expenses by territory in South America for 1999 and 2000. Include all possible subtotals.
  • List the top 10 sales representatives in Asia according to 2000 sales revenue for automotive products, and rank their commissions.
All these requests involve multiple dimensions. Many multidimensional questions require aggregated data and comparisons of data sets, often across time, geography or budgets.



SQL for Analysis

Oracle has advanced SQL analytical processing capabilities using a family of analytic SQL functions. These analytic functions enable you to calculate:
  • Rankings and percentiles
  • Moving window calculations
  • Lag/lead analysis
  • First/last analysis
  • Linear regression statistics
Ranking functions include cumulative distributions, percent rank, and N-tiles. Moving window calculations allow you to find moving and cumulative aggregations, such as sums and averages. Lag/lead analysis enables direct inter-row references so you can calculate period-to-period changes. First/last analysis enables you to find the first or last value in an ordered group.
Other features include the CASE expression. CASE expressions provide if-then logic useful in many situations.
To enhance performance, analytic functions can be parallelized: multiple processes can simultaneously run all of these statements. These capabilities make calculations easier and more efficient, thereby enhancing database performance, scalability, and simplicity.

SQL for Modeling

Oracle's MODEL clause brings a new level of power and flexibility to SQL calculations. With the MODEL clause, you can create a multidimensional array from query results and then apply formulas to this array to calculate new values. The formulas can range from basic arithmetic to simultaneous equations using recursion. For some applications, the MODEL clause can replace PC-based spreadsheets. Models in SQL leverage Oracle's strengths in scalability, manageability, collaboration, and security. The core query engine can work with unlimited quantities of data. By defining and executing models within the database, users avoid transferring large datasets to and from separate modeling environments. Models can be shared easily across workgroups, ensuring that calculations are consistent for all applications. Just as models can be shared, access can also be controlled precisely with Oracle's security features. With its rich functionality, the MODEL clause can enhance all types of applications.

0 comments:

Post a Comment

newer post older post Home