I would like to do filtering using server-side data selection and the Cassandra Spark connector. We have many sensors that each send a value every 1 s, and we want to collect these data by month, day, hour, etc. I have proposed the following data model:
CREATE TABLE project1 (
    year int,
    month int,
    load_balancer int,
    day int,
    hour int,
    estimate_time timestamp,
    sensor_id int,
    PRIMARY KEY ((year, month, load_balancer), day, hour, estimate_time, sensor_id)
);
For example, we are interested in the data collected in December 2014. These data are spread over 4 load balancers (0, 1, 2, 3), so they sit in 4 different partitions.
We are using Cassandra Spark connector version 1.1.1, and we use a union of queries to collect all the values for one hour. Processing the 4,341,390 tuples takes Spark 11 min to return the result. The issue is that we have 5 nodes, yet Spark uses only one worker to execute the task. Can you suggest an update to the query or to the data model?
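The union-of-queries approach the question describes can be sketched roughly as follows. The keyspace name `sensors`, the connection host, and the assumption that the connector pushes a full-partition-key `.where` predicate down to Cassandra are mine, not from the question:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

val conf = new SparkConf()
  .setAppName("DecemberScan")
  .set("spark.cassandra.connection.host", "127.0.0.1") // assumed host

val sc = new SparkContext(conf)

// One server-side query per load_balancer partition (0..3), unioned together.
val perPartition = (0 to 3).map { lb =>
  sc.cassandraTable("sensors", "project1")
    .where("year = ? AND month = ? AND load_balancer = ?", 2014, 12, lb)
}
val december = perPartition.reduce(_ union _)
```

Even though the selection itself happens server-side, the resulting RDDs can end up being processed by a single worker if each partition read is scheduled as one task, which matches the symptom described above.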
This feature is in the connector. You can create an arbitrary RDD with the key values and then use it as a source of keys to fetch data from the Cassandra table; in other words, you join an arbitrary RDD with the Cassandra RDD. In your case, the arbitrary RDD would contain the 4 different load_balancer values. Note that SCC 1.2 has been released recently, and it is probably compatible with Spark 1.1 (although it was made for Spark 1.2).
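In SCC 1.2 this join is exposed as `joinWithCassandraTable`. A minimal sketch of the answer's suggestion, assuming the keyspace is called `sensors` and that `sc` is an existing `SparkContext`:

```scala
import com.datastax.spark.connector._

// Case class mirroring the three partition-key columns of project1.
case class Key(year: Int, month: Int, load_balancer: Int)

// A small RDD with one element per Cassandra partition of interest:
// the 4 load_balancer values for December 2014.
val keys = sc.parallelize((0 to 3).map(lb => Key(2014, 12, lb)))

// Each key triggers a direct read of its partition, and the reads are
// distributed across the executors rather than funnelled through one worker.
val december = keys.joinWithCassandraTable("sensors", "project1")
```

The design point is that the driver no longer has to issue and union 4 separate queries; the connector turns the key RDD into parallel single-partition reads.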