As I had never actually tested this... I created two Python scripts to generate 1 million records each for two tables... DOC_HEADER and DOC_DETAILS. The two tables share the DOCUMENT_ID, YEAR and AREA fields, but since the values are generated randomly, how many keys the two tables end up having in common is not known in advance...
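The generator scripts themselves aren't shown in the post; here is a minimal sketch of what such a script might look like. The field names match the test query below, but the value ranges and the CSV output are my assumptions for illustration:

```python
import csv
import random

def generate_records(n, seed=None):
    """Build n random (DOCUMENT_ID, YEAR, AREA, AMOUNT) tuples.

    The value ranges here are assumptions; the original scripts'
    actual ranges are not shown in the post."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        records.append((
            rng.randint(1, 500000),               # DOCUMENT_ID
            rng.randint(2000, 2013),              # YEAR
            rng.choice(['001', '002', '003']),    # AREA
            round(rng.uniform(10.0, 5000.0), 2),  # AMOUNT
        ))
    return records

def write_csv(path, records):
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(records)

if __name__ == '__main__':
    # The full test used n=1000000; a small sample keeps this demo quick.
    write_csv('doc_header_sample.csv', generate_records(1000, seed=42))
```

Because the keys are drawn at random and independently for each table, only a fraction of the 1 million rows will have a matching (DOCUMENT_ID, YEAR, AREA) partner in the other table.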
For the sake of the example I created the DOC_HEADER and DOC_DETAILS tables twice... once as Row tables and once as Column tables... and built another Python script to measure the reading time...
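The post doesn't show the DDL, but in SAP HANA the store type is chosen at creation time with CREATE ROW TABLE versus CREATE COLUMN TABLE. A sketch of how the four tables could be created; the column types are my assumption, derived from the fields used in the query:

```python
def build_ddl(schema, name, store):
    """Return a CREATE TABLE statement for the given store type.

    store must be 'ROW' or 'COLUMN' (SAP HANA syntax); the column
    list is an assumption based on the fields in the test query."""
    assert store in ('ROW', 'COLUMN')
    return (
        'CREATE %s TABLE "%s"."%s" ('
        '"DOCUMENT_ID" INTEGER, '
        '"YEAR" INTEGER, '
        '"AREA" NVARCHAR(3), '
        '"AMOUNT" DECIMAL(15,2))' % (store, schema, name)
    )

# One ROW and one COLUMN variant of each table, as used in the benchmark.
for store in ('ROW', 'COLUMN'):
    for base in ('DOC_HEADER', 'DOC_DETAIL'):
        print(build_ddl('BLAG', '%s_%s' % (base, store), store))
```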
Column_Row_SAPHANA_Test.py

```python
import dbapi  # SAP HANA Python client (in newer installs: from hdbcli import dbapi)
import time

conn = dbapi.connect('*****', 30015, 'SYSTEM', '****')
cur = conn.cursor()

# Same query for both store types; only the table suffix changes.
QUERY = '''SELECT H."DOCUMENT_ID", H."YEAR", H."AREA", SUM("AMOUNT") AS AMOUNT
FROM "BLAG"."DOC_HEADER_%(store)s" AS H
INNER JOIN "BLAG"."DOC_DETAIL_%(store)s" AS D
ON H."DOCUMENT_ID" = D."DOCUMENT_ID"
AND H."YEAR" = D."YEAR"
AND H."AREA" = D."AREA"
GROUP BY H."DOCUMENT_ID", H."YEAR", H."AREA"'''

def benchmark(label, store):
    start = time.clock()  # Python 2; on Python 3 use time.perf_counter()
    cur.execute(QUERY % {'store': store})
    ret = cur.fetchall()
    time_taken = time.clock() - start
    print("%s - %s records in %s seconds" % (label, len(ret), time_taken))

benchmark("Row Store", "ROW")
benchmark("Column Store", "COLUMN")
```
Both queries do exactly the same thing... an INNER JOIN between the two tables followed by a GROUP BY... using the Row store tables in the first one and the Column store tables in the second one...
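To make explicit what both queries compute: match header and detail rows on the composite (DOCUMENT_ID, YEAR, AREA) key, then sum AMOUNT per key. The same logic in plain Python over a few made-up sample rows:

```python
from collections import defaultdict

# Toy data: header rows are (DOCUMENT_ID, YEAR, AREA) keys,
# detail rows carry an AMOUNT as a fourth field.
header = [(1, 2012, '001'), (2, 2012, '001'), (3, 2013, '002')]
detail = [(1, 2012, '001', 100.0), (1, 2012, '001', 50.0),
          (2, 2012, '001', 25.0), (9, 2013, '002', 75.0)]

def inner_join_sum(header, detail):
    """INNER JOIN on the composite key, then GROUP BY key with SUM(AMOUNT)."""
    keys = set(header)
    totals = defaultdict(float)
    for doc_id, year, area, amount in detail:
        if (doc_id, year, area) in keys:  # keep only keys present in both tables
            totals[(doc_id, year, area)] += amount
    return dict(totals)

print(inner_join_sum(header, detail))
# The detail row with DOCUMENT_ID 9 has no header match, so it is dropped.
```

That "keys present in both tables" filter is why the benchmark below returns only 7669 aggregated records out of the 1 million rows in each table.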
Let's run this 3 times and see what happens ;-)
Benchmark

```
Row Store    - 7669 records in 2.45972177882 seconds
Column Store - 7669 records in 1.59409638594 seconds
Row Store    - 7669 records in 2.32650395252 seconds
Column Store - 7669 records in 1.68382533783 seconds
Row Store    - 7669 records in 2.44930342075 seconds
Column Store - 7669 records in 1.52396673192 seconds
```
As we can see... the Column store is consistently faster... roughly 35%, 28% and 38% across the three runs, or about a third faster on average...
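Working the numbers from the three benchmark runs, a few lines of Python give the per-run speedup of Column over Row and the average:

```python
# (row_seconds, column_seconds) pairs from the three benchmark runs.
runs = [(2.45972177882, 1.59409638594),
        (2.32650395252, 1.68382533783),
        (2.44930342075, 1.52396673192)]

# Speedup as a percentage of the Row store time.
gains = [(row - col) / row * 100 for row, col in runs]
for i, g in enumerate(gains, 1):
    print("Run %d: Column store %.1f%% faster" % (i, g))
print("Average: %.1f%% faster" % (sum(gains) / len(gains)))
```

This prints roughly 35.2%, 27.6% and 37.8% for the three runs, about 33.5% on average, so the gain varies a bit between runs but stays in the same ballpark.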
Greetings,
Blag.
Developer Experience.