
Projections in HP Vertica - 1

What are Projections in HP Vertica?

Let's try to understand them by comparing Vertica with traditional databases such as Oracle, MySQL and SQL Server.
  • In a traditional database architecture, data is physically stored in tables. In addition, secondary tuning structures such as indexes and materialized views are created to improve query performance.
  • In contrast, tables do not occupy any physical storage at all in Vertica.
  • Physical storage consists of collections of table columns called projections.
  • Projections store data in a format that optimizes query execution. They are similar to materialized views (MVs) in that they store result sets on disk rather than computing them each time they are used in a query. The result sets are automatically refreshed whenever data values are inserted, appended or changed.
  • Projections are not aggregated; they store every row of a table, i.e. the full atomic detail.
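
To make the idea concrete, here is a toy model in Python. It is not Vertica code and does not reflect how Vertica is implemented internally. The names `Table`, `Projection`, `add_projection` and `insert` are invented for illustration, and the sketch assumes only what the points above say: the table is purely logical, each projection stores a subset of columns in its own sort order, and every insert is propagated to all projections.

```python
from bisect import insort


class Projection:
    """Toy projection: a subset of table columns kept sorted by sort_key."""

    def __init__(self, name, columns, sort_key):
        self.name = name
        self.columns = columns
        self.sort_key = sort_key
        self.rows = []  # list of (sort key values, projected row), kept sorted

    def add(self, row):
        projected = tuple(row[c] for c in self.columns)
        key = tuple(row[c] for c in self.sort_key)
        insort(self.rows, (key, projected))  # insert while preserving order

    def scan(self):
        return [projected for _, projected in self.rows]


class Table:
    """Logical table: holds only a schema; the data lives in projections."""

    def __init__(self, columns):
        self.columns = columns
        self.projections = []

    def add_projection(self, projection):
        self.projections.append(projection)

    def insert(self, **row):
        # Every projection is refreshed as part of the insert itself.
        for projection in self.projections:
            projection.add(row)


# Hypothetical sample data, for illustration only.
sales = Table(["sale_id", "region", "amount"])
by_id = Projection("sales_super", ["sale_id", "region", "amount"], ["sale_id"])
by_region = Projection("sales_by_region", ["region", "amount"], ["region"])
sales.add_projection(by_id)
sales.add_projection(by_region)

sales.insert(sale_id=2, region="west", amount=30)
sales.insert(sale_id=1, region="east", amount=10)
sales.insert(sale_id=3, region="east", amount=20)

print(by_id.scan())      # rows ordered by sale_id
print(by_region.scan())  # rows ordered by region, only two columns stored
```

Note that the `Table` object never stores a row itself. Both projections hold full row-level detail (no aggregation), just in different column sets and sort orders.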

Definition:
A projection is an optimized collection of table columns that provides physical storage for data. A projection can contain some or all of the columns of one or more tables. A projection that contains all of the columns of a table is called a super-projection. A projection that joins columns from more than one table is called a pre-join projection.
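
The super-projection part of the definition can be expressed as a simple check. This is a sketch in Python rather than Vertica syntax; the function `is_super_projection` and the sample column names are invented for illustration, and pre-join projections are not modelled.

```python
def is_super_projection(table_columns, projection_columns):
    """Return True when the projection contains every column of the table."""
    return set(table_columns) <= set(projection_columns)


# Hypothetical table and projections, for illustration only.
customer_columns = ["customer_id", "name", "city"]
customer_super = ["customer_id", "name", "city"]
customer_by_city = ["city", "customer_id"]

print(is_super_projection(customer_columns, customer_super))    # True
print(is_super_projection(customer_columns, customer_by_city))  # False
```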

What are the benefits of Projections?

  • Projections allow data to be sorted in any order (even if different from the source tables). This enhances query performance and compression.
  • Projections deliver high availability optimized for performance, since the redundant copies of data are always actively used in analytics. The redundant copy can automatically be stored using a different sort order, which provides the same benefits as a secondary index in a more efficient manner.
  • Projections do not require a batch update window. Data is automatically available upon load.
  • Projections are transparent to end users and SQL. The Vertica query optimizer automatically picks the best projections to use for any query.
  • Projections are dynamic and can be added or changed at any time without stopping the database.
Note: Vertica's projections represent collections of columns (so you could think of a projection as a table), but they are optimized for analytics at the physical storage level and are not constrained by the logical schema.
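
To see why sort order helps compression, consider run-length encoding, where consecutive repeated values are stored once together with a count. The sketch below is plain Python, not Vertica's storage code; `run_length_encode` and the sample `regions` column are invented for illustration.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


# Hypothetical column values in load order, for illustration only.
regions = ["east", "west", "east", "north", "west", "east", "north", "west"]

unsorted_runs = run_length_encode(regions)
sorted_runs = run_length_encode(sorted(regions))

print(len(unsorted_runs))  # 8 runs: no two neighbouring values repeat
print(sorted_runs)         # [('east', 3), ('north', 2), ('west', 3)]
```

The same eight values shrink from eight runs to three once the column is sorted, which is the effect a well-chosen projection sort order aims for on low-cardinality columns.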

We have now covered the basic concept of Vertica's projections and have a basic understanding of what projections are. Let's go into more detail in the next post: Vertica Projections - 2

Please share it with your friends. Let's learn together!





