Introduction to Vertica Database

What is Vertica Analytic Database?

  • Vertica Analytic Database is designed to manage large, fast-growing volumes of data.
  • Vertica was developed by Vertica Systems, a company founded in 2005 by database researcher Michael Stonebraker and Andrew Palmer.
  • Vertica was acquired by Hewlett Packard in March 2011.
  • Vertica Analytic Database is an innovative, ground-up implementation of a relational database management system optimized for read-intensive workloads.
  • Vertica provides extremely fast ad hoc SQL query performance, even for very large databases, making it well suited for:
    • Data warehousing
    • Data marts
    • Fraud detection
    • Call detail analysis
    • Business intelligence
    • Other query-intensive applications
Why use Vertica?

By now you can probably guess the answer to the question "Why should we use Vertica?" The answer is simple: performance.

The key reasons for Vertica's performance are:
  • Vertica organizes data on disk as columns of values from the same attribute. This means that when a query needs to access only a few columns of a particular table, only those columns need to be read from disk. Conversely, in a row-oriented database such as Oracle, MySQL, or IBM DB2, all columns of a table are typically read from disk, which wastes I/O bandwidth.

  • Vertica employs aggressive compression of data on disk, as well as a query execution engine that can keep data compressed while operating on it. Compression in Vertica is particularly effective because values within a column tend to be quite similar to each other and compress very well, often by up to 90%. In a traditional row-oriented database, by contrast, values within a row of a table are not likely to be very similar, and hence are unlikely to compress well.

  • Because the data is compressed so aggressively, Vertica has sufficient space to store multiple copies of the data to ensure fault tolerance and to improve concurrent and ad hoc query performance. Logical tables are decomposed and physically stored as overlapping groups of columns called "projections". Each projection is sorted on a different attribute, which optimizes it for answering queries with predicates on its sort attribute.
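To make the first two points concrete, here is a small Python sketch. It is a toy model, not Vertica's storage format: the sample data, the function names, and the choice of run-length encoding as the compression scheme are all assumptions made for illustration. It counts how many values a row layout and a column layout would touch for a query on a single column, and shows a sum computed directly on run-length-encoded data.

```python
# Toy illustration only: this is not Vertica's on-disk format or API.
rows = [
    ("2011-03-01", "US", 120),
    ("2011-03-01", "US", 80),
    ("2011-03-01", "DE", 45),
    ("2011-03-02", "DE", 60),
    ("2011-03-02", "DE", 75),
    ("2011-03-02", "US", 30),
]
names = ("day", "country", "amount")

# Column layout: one list per attribute instead of one tuple per row.
columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}


def values_read_row_store(rows):
    """A row store fetches whole rows, so every attribute is touched."""
    return sum(len(row) for row in rows)


def values_read_column_store(columns, wanted):
    """A column store touches only the requested columns."""
    return sum(len(columns[name]) for name in wanted)


def rle_encode(values):
    """Run-length encode a list into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_sum(runs):
    """Sum numeric values without decompressing: value * run_length per run."""
    return sum(value * count for value, count in runs)


row_reads = values_read_row_store(rows)                     # all 3 attributes of 6 rows
col_reads = values_read_column_store(columns, ["amount"])   # only the 6 amounts
day_runs = rle_encode(columns["day"])                       # 6 values collapse to 2 runs
```

In this toy data the `day` column shrinks from six values to two runs because equal values sit next to each other, which is the property the compression point above relies on.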

A Vertica database is composed of these query-optimized structures on disk, without the overhead of base tables. It is similar in concept to a database made entirely of materialized views (with no base tables).
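The idea of storing a table only as sorted, overlapping groups of columns can be sketched in a few lines of Python. This is a conceptual model under stated assumptions: `build_projection`, `lookup`, and the sample table are hypothetical names invented for this example, and real projections involve far more (segmentation across nodes, encoding, and so on).

```python
# Toy model only: names and structure are hypothetical, not Vertica's API.
from bisect import bisect_left, bisect_right

table = [
    {"order_id": 1, "customer": "bob", "day": 3, "amount": 40},
    {"order_id": 2, "customer": "alice", "day": 1, "amount": 25},
    {"order_id": 3, "customer": "bob", "day": 2, "amount": 10},
    {"order_id": 4, "customer": "carol", "day": 1, "amount": 70},
]


def build_projection(rows, cols, sort_key):
    """Store a subset of the table's columns, sorted on one attribute."""
    ordered = sorted(rows, key=lambda r: r[sort_key])
    return {
        "sort_key": sort_key,
        "columns": {c: [r[c] for r in ordered] for c in cols},
    }


def lookup(projection, value, out_col):
    """Answer an equality predicate on the sort key with a binary search."""
    keys = projection["columns"][projection["sort_key"]]
    lo, hi = bisect_left(keys, value), bisect_right(keys, value)
    return projection["columns"][out_col][lo:hi]


# Two overlapping projections of the same logical table; no base table is kept.
by_customer = build_projection(table, ["customer", "amount"], "customer")
by_day = build_projection(table, ["day", "order_id", "amount"], "day")

bob_amounts = lookup(by_customer, "bob", "amount")
day_one_orders = lookup(by_day, 1, "order_id")
```

A query filtering on `customer` is served from `by_customer`, and one filtering on `day` from `by_day`; each finds its matching rows as a contiguous range because its projection is sorted on the predicate column.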

Next, we will look at the Vertica approach.
 
