Can cqlsh COPY command be run through

3 messages

Can cqlsh COPY command be run through

Tiwari, Tarun

Hi,

 

I am wondering whether the cqlsh COPY command can be run from a Spark Scala program, and whether it benefits from the parallelism achieved by Spark.

I am doing something like below:

 

val conf = new SparkConf(true)
  .setMaster("spark://Master-Host:7077")
  .setAppName("Load Cs Table using COPY TO")

lazy val sc = new SparkContext(conf)

import com.datastax.spark.connector.cql.CassandraConnector

CassandraConnector(conf).withSessionDo { session =>
  session.execute("truncate wfcdb.test_wfctotal;")
  session.execute("COPY wfcdb.test_wfctotal (wfctotalid, timesheetitemid, employeeid, durationsecsqty, wageamt, moneyamt, applydtm, laboracctid, paycodeid, startdtm, stimezoneid, adjstartdtm, adjapplydtm, enddtm, homeaccountsw, notpaidsw, wfcjoborgid, unapprovedsw, durationdaysqty, updatedtm, totaledversion, acctapprovalnum) FROM '/home/analytics/Documents/wfctotal.dat' WITH DELIMITER = '|' AND HEADER = true;")
}

 

Regards,

Tarun Tiwari | Workforce Analytics-ETL | Kronos India

M: +91 9540 28 27 77 | Tel: +91 120 4015200

Kronos | Time & Attendance • Scheduling • Absence Management • HR & Payroll • Hiring • Labor Analytics

Join Kronos on: kronos.com | Facebook | Twitter | LinkedIn | YouTube

 


Re: Can cqlsh COPY command be run through

DuyHai Doan
Short answer is no.

Whenever you access the session object of the Java driver directly (using withSessionDo { ... }), you bypass the data locality optimisation made by the connector.
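(A further wrinkle worth noting: COPY is a cqlsh shell command rather than server-side CQL, so session.execute("COPY ...") would fail regardless of locality.) The connector-native alternative is to parse the delimited file as an RDD and write it with saveToCassandra, which distributes the writes across executors and keeps the connector's locality-aware batching. The sketch below reuses the keyspace, table, file path, and delimiter from the original post but maps only three of the 22 columns for brevity; the object name LoadWfcTotal is invented for illustration, and this is a sketch of the approach, not a drop-in replacement.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // adds saveToCassandra to RDDs

object LoadWfcTotal {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf(true)
      .setMaster("spark://Master-Host:7077")
      .setAppName("Load Cs Table via saveToCassandra")
    val sc = new SparkContext(conf)

    // Parse the pipe-delimited file in parallel, skipping the header row.
    val lines  = sc.textFile("/home/analytics/Documents/wfctotal.dat")
    val header = lines.first()
    val rows = lines
      .filter(_ != header)
      .map(_.split('|'))
      // Only three columns mapped here for brevity; extend to all 22.
      .map(f => (f(0).toLong, f(1).toLong, f(2).toLong))

    // Each partition is written by the executor holding it, preserving
    // the connector's data-locality and batching optimisations.
    rows.saveToCassandra("wfcdb", "test_wfctotal",
      SomeColumns("wfctotalid", "timesheetitemid", "employeeid"))
  }
}
```

Unlike a single cqlsh COPY session, this spreads both the parsing and the writes across the cluster.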



On Sun, Apr 5, 2015 at 9:53 AM, Tiwari, Tarun <[hidden email]> wrote:

[quoted message above]

RE: Can cqlsh COPY command be run through

Tiwari, Tarun

Thanks.

 

That was the logical guess I was having about it. Thanks for confirming.

 

 

 

From: DuyHai Doan [mailto:[hidden email]]
Sent: Wednesday, April 08, 2015 1:05 AM
To: [hidden email]
Subject: Re: Can cqlsh COPY command be run through

 

[quoted message above]