Creating 'Put' requests

Matthew Johnson

Hi all,

 

I'm currently looking at switching from HBase to Cassandra, and one big difference so far is that in HBase we create a ‘Put’ object, add a set of column/value pairs to it, and send the Put to the server. So far, the tutorials for Cassandra 2.1.4 seem to suggest using CQL3, which I really like for prototyping, e.g.:

 

session.execute("INSERT INTO simplex.playlists (id, song_id, title, album, artist) VALUES (1,1,'La Petite Tonkinoise','Bye Bye Blackbird','Joséphine Baker');");

 

But for more complicated code this will quickly become unmanageable, and doesn’t lend itself well to dynamically creating row data based on various conditions. Is there a way to send a Java object, populated with the desired column/value pairs, to the server instead of executing an insert statement? Would this require some other library, or does the DataStax Java driver support this already?

 

Thanks in advance,

Matt

 


Re: Creating 'Put' requests

Jim Witschey
Are prepared statements what you're looking for?

http://docs.datastax.com/en/developer/java-driver/2.1/java-driver/quick_start/qsSimpleClientBoundStatements_t.html
Jim Witschey

Software Engineer in Test | [hidden email]






RE: Creating 'Put' requests

Matthew Johnson
Hi Jim,

This would still involve either having a fixed(ish) schema, with a handful
of pre-written prepared statements that I fill the values into, or some
rather horrific StringBuilder that generates the statement based on some
logic. Prepared Statements work great, for example, for inserting users
where the columns are known eg 'firstname, lastname, postcode', but what
about when you want to add timeseries data with the timestamp as the column?
I would have to do something like (ignore incorrect syntax for now):

        String myQuery = "INSERT INTO myKeyspace.myTable (id, " + myPojo.getTimestamp()
                + ", " + myPojo.getMySecondTimestamp() + ") VALUES (?, ?, ?);";
        PreparedStatement statement = session.prepare(myQuery);
        session.execute(statement.bind("row1", myPojo.getValue(), myPojo.getSecondValue()));

Which is already a bit ugly, but when you start talking about a handful or a
few dozen columns, it will become unmanageable.

In HBase, we do something like:

        Put put = new Put(id);
        put.add(myPojo.getTimestamp(), myPojo.getValue());
        put.add(myPojo.getMySecondTimestamp(), myPojo.getSecondValue());
        server.put(put);

Is there any similar mechanism in Cassandra Java driver for creating these
inserts programmatically? Or, can the 'session.execute' take a list of
commands so that each column can be inserted as its own insert statement but
without the overhead of multiple calls to the server?

Thanks!
Matt
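
A minimal sketch of the kind of 'Put'-style helper asked about above, using only the JDK. The class name CqlPut and its methods are hypothetical and not part of the DataStax driver; the sketch only collects column/value pairs and produces a parameterized INSERT string plus the values to bind. Column names are concatenated into the statement unchecked, so it assumes they come from trusted code and already exist in the table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical HBase-style "Put": collects column/value pairs for one row. */
final class CqlPut {
    private final String table;
    // LinkedHashMap keeps insertion order, so columns and values line up.
    private final Map<String, Object> columns = new LinkedHashMap<>();

    CqlPut(String table) {
        this.table = table;
    }

    /** Adds (or overwrites) one column/value pair; returns this for chaining. */
    CqlPut add(String column, Object value) {
        columns.put(column, value);
        return this;
    }

    /** Builds "INSERT INTO t (a, b) VALUES (?, ?);" with one marker per column. */
    String toCql() {
        List<String> markers = new ArrayList<>();
        for (int i = 0; i < columns.size(); i++) {
            markers.add("?");
        }
        return "INSERT INTO " + table
                + " (" + String.join(", ", columns.keySet()) + ")"
                + " VALUES (" + String.join(", ", markers) + ");";
    }

    /** Values in the same order as the column list, ready to bind. */
    Object[] values() {
        return columns.values().toArray();
    }
}

final class CqlPutDemo {
    public static void main(String[] args) {
        CqlPut put = new CqlPut("simplex.playlists")
                .add("id", 1)
                .add("title", "La Petite Tonkinoise");
        System.out.println(put.toCql());
        // Assumption: with the driver on the classpath, the pair could be
        // passed on as session.execute(put.toCql(), put.values());
    }
}
```

The Java types of the values would still have to match the CQL column types, and a statement built this way is not prepared, so it is parsed again on every call.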



RE: Creating 'Put' requests

Matthew Johnson

Hi Jim,

 

I think I have found what I was looking for here:

 

https://gist.github.com/yangzhe1991/10349122

 

I would end up with code that looks something like this:

 

       // Imports needed: com.datastax.driver.core.DataType,
       // com.datastax.driver.core.querybuilder.Insert and QueryBuilder,
       // com.datastax.driver.core.schemabuilder.Create and SchemaBuilder.

       public void createSchema() {
              System.out.println("CREATING SCHEMA");

              Create createTable = SchemaBuilder.createTable("simplex", "mytable1");
              createTable = createTable.ifNotExists();
              createTable = createTable.addPartitionKey("id", DataType.text());
              createTable = createTable.addColumn("title", DataType.text());
              createTable = createTable.addColumn("author", DataType.text());

              session.execute(createTable);

              System.out.println("SCHEMA CREATED");
       }

       public void loadData() {
              System.out.println("LOADING DATA");

              Insert builder = QueryBuilder.insertInto("simplex", "mytable1");
              builder = builder.value("id", "myid2");
              builder = builder.value("title", "mytitle2");
              builder = builder.value("author", "myauthor2");
              // 'author2' is not in the schema above, so Cassandra rejects
              // this insert unless the column is added to the table first.
              builder = builder.value("author2", "myauthor2_2");
              session.execute(builder);

              System.out.println("DATA LOADED");
       }

 

 

But do let me know if you know of any problems (performance or otherwise) with this approach. I am using a relatively recent version of the DataStax Java driver (cassandra-driver-core 2.1.5), and none of these methods are deprecated, so I am assuming they are OK to use in conjunction with CQL3.

 

Unfortunately it seems that I was misinformed about the “dynamically creating timeseries columns” feature, and that this WAS deprecated in CQL3: in order to dynamically create columns I would have to issue an ‘ALTER TABLE’ statement for every new column. I read one suggestion, which is to use collections instead. So basically I would have a single pre-defined column which is a Map, say, and then add ‘timestamp : value’ entries to that map instead of a new column for every timestamp. Would you say this is an acceptable approach?

 

Many thanks,

Matt

 

PS apologies for the noobness!!

 

 


Re: Creating 'Put' requests

Alex Popescu-2

On Thu, Apr 23, 2015 at 8:50 AM, Matthew Johnson <[hidden email]> wrote:
> Unfortunately it seems that I was misinformed on the “dynamically creating timeseries columns” feature, and that this WAS deprecated in CQL3 – in order to dynamically create columns I would have to issue an ‘ALTER TABLE’ statement for every new column. I read one suggestion, which is to use collections instead - so basically have a single pre-defined column which is a Map, say, and then add ‘timestamp : value’ into that map instead of a new column for every timestamp. Would you say this is an acceptable approach?

Depending on the data model and the queries your application will use, you'll be using either clustering columns or collections (or a combination). If you need help modeling, you could start a new thread with the relevant details and I'm pretty sure you'll get some good suggestions here.


--
Bests,

Alex Popescu | @al3xandru
Sen. Product Manager @ DataStax
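
To illustrate the clustering-column option mentioned above, here is a minimal JDK-only sketch. The table simplex.readings, its columns, and the class TimeSeriesSketch are assumptions made for illustration and do not come from the thread. The idea is that the timestamp becomes a clustering column, so each new timestamp is a new row inside the partition rather than a new column, and no ALTER TABLE is needed.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TimeSeriesSketch {
    // Hypothetical schema: 'id' is the partition key, 'ts' the clustering column.
    static final String CREATE_TABLE =
            "CREATE TABLE IF NOT EXISTS simplex.readings ("
            + "id text, ts timestamp, value double, "
            + "PRIMARY KEY (id, ts));";

    // One fixed statement covers every timestamp, so it can be prepared once.
    static final String INSERT =
            "INSERT INTO simplex.readings (id, ts, value) VALUES (?, ?, ?);";

    /** Turns one HBase-style wide row into bind values, one triple per timestamp. */
    static List<Object[]> toBindValues(String id, Map<Long, Double> samples) {
        List<Object[]> rows = new ArrayList<>();
        for (Map.Entry<Long, Double> sample : samples.entrySet()) {
            rows.add(new Object[] {id, new Date(sample.getKey()), sample.getValue()});
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<Long, Double> samples = new LinkedHashMap<>();
        samples.put(1429795680000L, 20.5);
        samples.put(1429795740000L, 21.0);
        for (Object[] row : toBindValues("row1", samples)) {
            System.out.println(INSERT + " <- " + row[0] + ", " + row[1] + ", " + row[2]);
        }
    }
}
```

The collection alternative would instead declare a single map<timestamp, double> column and update it per entry. Which of the two fits better depends on the queries, as the reply above says.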


Re: Creating 'Put' requests

Philo Yang
In reply to this post by Matthew Johnson


2015-04-23 22:16 GMT+08:00 Matthew Johnson <[hidden email]>:
> In HBase, we do something like:
>
>         Put put = new Put(id);
>         put.add(myPojo.getTimestamp(), myPojo.getValue());
>         put.add(myPojo.getMySecondTimestamp(), myPojo.getSecondValue());
>         server.put(put);
>
> Is there any similar mechanism in Cassandra Java driver for creating these
> inserts programmatically? Or, can the 'session.execute' take a list of
> commands so that each column can be inserted as its own insert statement but
> without the overhead of multiple calls to the server?




For the second question: C* can execute several commands in one unlogged batch. However, because of the distributed nature of Cassandra, there is a better solution; see https://medium.com/@foundev/cassandra-batch-loading-without-the-batch-keyword-40f00e35e23e

 

--
Thanks,
Phil Yang
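
As a rough illustration of the alternative to batching suggested above, the sketch below sends each write individually and asynchronously while capping how many are in flight at once. It uses only the JDK. The class AsyncWritesSketch, the method writeAll, and the 'send' function are hypothetical stand-ins: in driver 2.1, session.executeAsync returns a Guava-based ResultSetFuture rather than a CompletableFuture, so real code would need an adapter or callbacks instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class AsyncWritesSketch {
    /**
     * Passes every statement to 'send', keeping at most 'maxInFlight'
     * requests outstanding, then waits until all of them have completed.
     */
    static <S, R> void writeAll(List<S> statements,
                                Function<S, CompletableFuture<R>> send,
                                int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<R>> pending = new ArrayList<>();
        for (S statement : statements) {
            permits.acquireUninterruptibly(); // blocks while too many are in flight
            pending.add(send.apply(statement)
                    // Free the slot whether the write succeeded or failed.
                    .whenComplete((result, error) -> permits.release()));
        }
        // join() rethrows the first failure wrapped in a CompletionException.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
    }

    public static void main(String[] args) {
        AtomicInteger sent = new AtomicInteger();
        List<String> statements = Arrays.asList("insert 1", "insert 2", "insert 3");
        // Stand-in for a real asynchronous write: just counts the calls.
        writeAll(statements, cql -> CompletableFuture.supplyAsync(sent::incrementAndGet), 2);
        System.out.println("writes completed: " + sent.get());
    }
}
```

The in-flight cap is a design choice for the sketch: without it, a large load could queue far more requests than the cluster can absorb at once.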


Re: Creating 'Put' requests

Jens Rantil
Matthew,

Maybe this could also be of interest: http://projects.spring.io/spring-data-cassandra/

Cheers,
Jens





--
Jens Rantil
Backend engineer
Tink AB

Phone: +46 708 84 18 32


Re: Creating 'Put' requests

Jonathan Haddad
In reply to this post by Philo Yang
To add to Phil's point, there's no circumstance in which I would use an unlogged batch. Under load, I have yet to hear of it doing anything other than increasing GC pauses.


RE: Creating 'Put' requests

Matthew Johnson

The object-mapping API is very interesting, I’ll check that out, thanks. I believe I have found what I was looking for in terms of programmatically inserting data using the following syntax:

 

              Insert builder = QueryBuilder.insertInto("simplex", "mytable1");
              builder = builder.value("id", "myid2");
              builder = builder.value("title", "mytitle2");
              session.execute(builder);

 

Many thanks for all the valuable help so far!

 

Cheers,

Matt

 
