Pic: Vertica.com
- Queries are issued in SQL to a front end that parses and optimizes queries.
- Vertica is internally organized as a hybrid store consisting of two storage structures: the WOS and the ROS.
- Write-Optimized Store (WOS) is a data structure that generally fits into main memory and is designed to efficiently support insert and update operations.
- The data within the WOS is unsorted and uncompressed.
- Read-Optimized Store (ROS) contains the bulk of the data in the database, and is both sorted and compressed, making it efficient to read and query.
- A background process called the Tuple Mover moves data out of the WOS into the ROS.
- Because it operates on the entire WOS, the Tuple Mover can be very efficient, sorting many records at a time and writing them to disk as a batch.
- Both WOS and ROS are organized into columns, with each column representing one attribute of a table.
- Each column may be stored in one or more projections that represent partially redundant copies of the data in the database.
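The WOS-to-ROS flow described above can be sketched with a toy in-memory model. This is only an illustration, not Vertica's implementation or API: the class `ToyHybridStore`, its method names, and the sample rows are all hypothetical, and the `scan` method assumes that a read consults both stores, which the text above does not state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHybridStore:
    """Toy model of a WOS/ROS hybrid store (hypothetical, not Vertica's API)."""
    columns: tuple                            # column names, e.g. ("oid", "price")
    sort_key: str                             # column each ROS batch is sorted on
    wos: list = field(default_factory=list)   # unsorted rows held in memory
    ros: list = field(default_factory=list)   # sorted, column-organized batches

    def insert(self, row: dict) -> None:
        # Inserts only append to the WOS; nothing is sorted here.
        self.wos.append(row)

    def move_tuples(self) -> None:
        # Plays the role of the Tuple Mover: sort the whole WOS once,
        # write it out as one column-organized batch, then empty the WOS.
        if not self.wos:
            return
        ordered = sorted(self.wos, key=lambda r: r[self.sort_key])
        batch = {c: [r[c] for r in ordered] for c in self.columns}
        self.ros.append(batch)
        self.wos.clear()

    def scan(self, column: str) -> list:
        # Assumption of this sketch: a read sees the ROS batches first,
        # followed by whatever is still sitting unsorted in the WOS.
        values = [v for batch in self.ros for v in batch[column]]
        values.extend(r[column] for r in self.wos)
        return values


store = ToyHybridStore(columns=("oid", "price"), sort_key="price")
store.insert({"oid": 1, "price": 30})
store.insert({"oid": 2, "price": 10})
store.move_tuples()                        # both rows move to the ROS, sorted by price
store.insert({"oid": 3, "price": 20})      # this row stays in the WOS for now
print(store.scan("price"))
```

The point of the design is that the sorting cost is paid once per batch rather than once per insert, which is why writes stay cheap while the bulk of the data stays sorted.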
How is actual data stored in Vertica?
The diagram below illustrates how logical data in an example sales table is physically stored as columns.
Pic: Vertica.com
- As stated earlier, each column may be stored in one or more projections that represent partially redundant copies of the data in the database.
- For example, the sales table might be stored as two projections: one called sales-prices with the columns (oid, pid, date, price) and another called sales-customers with the columns (oid, pid, cust).
- Each of these projections has a sort order that specifies how the data in the projection is arranged on disk. For example, the sales-customers projection might be sorted on the customer ID, which makes it efficient to total all of the products that a customer bought. By storing several overlapping projections of a table in different sort orders, Vertica can efficiently answer many different types of queries.
- Vertica's Database Designer automatically selects a good set of overlapping projections for a particular table based on the set of queries issued to that table over time.
- It may seem that redundantly storing data in multiple projections wastes disk space. However, Vertica includes aggressive column-oriented compression schemes that allow it to reduce the amount of space a particular projection takes up in the ROS by as much as 90%.
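To make the projection idea concrete, here is a minimal Python sketch of the sales example. It is not Vertica code: the sample rows, the helper names `build_projection` and `run_length_encode`, and the choice to sort sales-prices on date are assumptions made for illustration. Run-length encoding stands in as one simple column-oriented compression scheme; the text above does not say which schemes Vertica applies.

```python
from itertools import groupby

# Hypothetical rows of the example sales table (values invented for illustration).
sales = [
    {"oid": 1, "pid": 7, "cust": "bob",   "date": "2024-01-02", "price": 30},
    {"oid": 2, "pid": 8, "cust": "alice", "date": "2024-01-02", "price": 10},
    {"oid": 3, "pid": 7, "cust": "bob",   "date": "2024-01-03", "price": 30},
    {"oid": 4, "pid": 9, "cust": "alice", "date": "2024-01-03", "price": 20},
]


def build_projection(rows, columns, sort_key):
    """Return a column-wise copy of `columns`, with rows sorted on `sort_key`."""
    ordered = sorted(rows, key=lambda r: r[sort_key])
    return {c: [r[c] for r in ordered] for c in columns}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(v, len(list(run))) for v, run in groupby(values)]


# Two overlapping projections: both carry oid and pid, each in its own sort order.
sales_prices = build_projection(sales, ("oid", "pid", "date", "price"), "date")
sales_customers = build_projection(sales, ("oid", "pid", "cust"), "cust")

# Sorting on cust puts each customer's rows next to each other...
print(sales_customers["cust"])
# ...so the sorted column collapses into a few (value, count) runs.
print(run_length_encode(sales_customers["cust"]))
```

Because sorting places equal values next to each other, the sort column of a projection tends to compress well, which is one way the extra copies can cost less disk space than they first appear to.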