Saturday, 19 December 2015

How to monitor Copy Command?



You can check COPY loads using HP Vertica “functions” and “LOAD_STREAMS system table”.

HP Vertica Functions:

1. GET_NUM_ACCEPTED_ROWS : 
To get the number of accepted rows, use the GET_NUM_ACCEPTED_ROWS function

Query:
SELECT GET_NUM_ACCEPTED_ROWS()

2. GET_NUM_REJECTED_ROWS : 
To check the number of rejected rows, use the GET_NUM_REJECTED_ROWS function

Query:
SELECT GET_NUM_REJECTED_ROWS()

3.  Include the CURRENT_LOAD_SOURCE function as a part of the COPY statement:
When you include the CURRENT_LOAD_SOURCE function as a part of the COPY statement, the input file name or value computed from it, can be inserted into a column.

To insert the file names into a column from multiple source files:
COPY t (c1, c2, c3 as CURRENT_LOAD_SOURCE()) FROM '/home/load_file_1' ON  exampledb_node02,
'/home/load_file_2' ON exampledb_node03 DELIMITER ',';


LOAD_STREAMS System Table:
    HP Vertica includes a set of system tables that include monitoring information, as described in Using System Tables. The LOAD_STREAMS system table includes information about load stream metrics from COPY and COPY FROM VERTICA statements, so you can query table values to get COPY metrics.

To see all table columns:
SELECT * FROM LOAD_STREAMS;

For More Info: Click Here

What is Copy Command in Vertica?



     Bulk loads data from one or more files or pipes on a cluster host into a Vertica database. (See LCOPY to load from a data file on a client system using ODBC.)
COPY can load data in one of three formats:
  • Text with delimiters (the default format)
  • Native binary using the NATIVE keyword
  • Native varchar using the NATIVE VARCHAR keyword


For More Info: Click Here

Saturday, 28 November 2015

What ANALYZE_STATISTICS will do?



         ANALYZE_STATISTICS is a DDL operation that auto-commits the current transaction, if any. The ANALYZE_STATISTICS function reads a fixed, 10 percent of disk contents to aggregate sample data for statistical analysis. To obtain a larger (or smaller) data sampling, use the ANALYZE_HISTOGRAM function, which lets you specify the percent of disk to read. Analyzing more that 10 percent disk space takes proportionally longer to process but results in a higher level of sampling accuracy. ANALYZE_STATISTICS is supported on local temporary tables, but not on global temporary tables.


For more information: Click Here

Saturday, 21 November 2015

What is DBD in Vertica?



    The Database Designer is a tool that recommends a physical database design (projections) that provides the best performance to answer your query needs. This is useful because it minimizes the time the DBA spends on physical database tuning and provides the ability to re-design the database incrementally to optimize for changing workloads over time.
You can run the Database Designer before or after a Vertica database has been deployed, and it runs as a background process.
When you use the Database Designer to create a design, the DBA provides the following inputs:
·                     Logical schema (CREATE TABLE statements)
·                     Sample data
·                     A sample set of queries that represent what you'd normally run
·                     A K-safety level

For More Info:
https://my.vertica.com/docs/5.0/HTML/Master/14415.htm

What is Hash-Segmentation and Unsegmented?




Hash-Segmentation:

    Segments a projection evenly and distributes the data across nodes using a built-in hash function. Creating a projection with hash segmentation results in optimal query execution. HP Vertica recommends segmenting large tables.


Unsegmented:


    Automatically replicates the unsegmented projection on each node. To perform distributed query execution, HP Vertica requires an unsegmented copy of each small table superprojection on each node.

What is K-Safe in Vertica?



      Vertica uses the concept of K-Safety for failure recovery. The K value represents the maximum number of nodes in a database that can fail and recover with no loss of data. In Vertica, the value of K can be zero (0), one (1), or two (2). The Physical Schema design must meet certain requirements. To create designs that are K-Safe, Vertica recommends using the Database Designer.
To determine the number of nodes required to meet your K-Safety needs, use the formula: K<N/2
where :
·                     K is the K-Safety value.
·                     N is the number of nodes.
The value of K can be 1 or 2 only when the physical schema design meets certain redundancy requirements. See Physical Schema. To create designs that are K-Safe, Vertica recommends that you use the Description: https://my.vertica.com/docs/4.1/HTML/Master/arrowright.gifDatabase Designer.

By default, Vertica creates K-Safe superprojections when the database has a K-Safety greater than 0 (K>0). When creating projections with the Database Designer, projection definitions that meet K-Safe design requirements are recommended and marked with the K-Safety level. Note the output from running the optimized design script generated by the Database Designer in the following example:

Table and Projection are physical or logical?



Table:
    Tables do not occupy any physical storage at all in Vertica.

Projection:
    Physical storage consists of collections of table columns called projections.