Objective Apache Cassandra Questions and Answers pdf

40). When to avoid secondary indexes?
Avoid secondary indexes on columns that contain a high number of unique values, where a query returns only a few results.

41). I have a row or key cache hit rate of 0.XX123456789 reported by JMX. Is that XX% or 0.XX% ?
XX%
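As a quick illustration of the arithmetic, the reported value is a fraction between 0 and 1, so multiplying by 100 gives the percentage. The function name below is made up for this sketch and is not part of any Cassandra or JMX API.

```python
def hit_rate_percent(jmx_value: float) -> float:
    """Convert a hit-rate fraction (0.0 to 1.0) into a percentage."""
    return jmx_value * 100.0


# A reported value of 0.85123456789 means roughly 85.12%, not 0.85%.
print(f"{hit_rate_percent(0.85123456789):.2f}%")
```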

42). What happens to existing data in my cluster when I add new nodes?
When a new node joins a cluster, it automatically contacts the other nodes in the cluster and copies the data it is now responsible for to itself.
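A rough way to picture this is a token ring: every node owns a range of hash values, and a joining node takes over part of one existing range. The sketch below is a deliberately simplified model, not Cassandra's implementation. It assumes one token per node, derives tokens from MD5 for determinism, and uses hypothetical names (`Ring`, `token_for`, `node-a` and so on). Real clusters typically use a different partitioner and many virtual nodes per machine.

```python
import bisect
import hashlib


def token_for(value: str) -> int:
    """Map a string to a position on the ring (first 4 bytes of its MD5)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Ring:
    """Toy token ring: a key belongs to the first node token >= the key token."""

    def __init__(self):
        self._tokens = []   # sorted list of node tokens
        self._owners = {}   # token -> node name

    def add_node(self, name: str) -> None:
        token = token_for(name)
        bisect.insort(self._tokens, token)
        self._owners[token] = name

    def owner(self, key: str) -> str:
        token = token_for(key)
        # Wrap around to the first node when the key hashes past the last token.
        idx = bisect.bisect_left(self._tokens, token) % len(self._tokens)
        return self._owners[self._tokens[idx]]


ring = Ring()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name)

keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.owner(k) for k in keys}

ring.add_node("node-d")  # the new node joins
after = {k: ring.owner(k) for k in keys}

# Only keys in the range taken over by node-d change owner.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys would be copied to node-d")
```

In this model, every key that changes owner ends up on the new node, and all other keys stay where they were, which is why existing data does not need to be reshuffled across the whole cluster.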

43). What are "Seed Nodes" in Cassandra?
A seed node in Cassandra is a node that other nodes contact when they first start up and join the cluster. A cluster can have multiple seed nodes. Seed nodes help a new node bootstrap when it joins a cluster. It is recommended to use two seed nodes per data center.

45). What are the benefits of NoSQL over relational database?
NoSQL addresses areas that the relational data model does not handle well, which are as follows:
Huge volume of structured, semi-structured, and unstructured data
Flexible data model (schema) that is easy to change
Scalability and performance for web-scale applications
Lower cost
Impedance mismatch between the relational data model and object-oriented programming
Built-in replication
Support for agile software development

46). What ports does Cassandra use?
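By default, Cassandra uses 7000 for cluster communication, 9160 for clients (Thrift), and 8080 for JMX. These defaults apply to older releases: since version 0.8 the default JMX port is 7199, and later releases serve CQL clients on 9042. All of these can be changed in the configuration file or bin/cassandra.in.sh (for JVM options). All ports are TCP.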
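To check which of these ports a node is actually listening on, a plain TCP connection attempt is enough. The sketch below uses only Python's standard `socket` module. The `DEFAULT_PORTS` dictionary and the `port_is_open` helper are names invented for this example, and the port numbers should be adjusted to match your version and configuration.

```python
import socket

# Port numbers quoted in the answer above; adjust for your own setup.
DEFAULT_PORTS = {"cluster": 7000, "thrift": 9160, "jmx": 8080}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in DEFAULT_PORTS.items():
        state = "open" if port_is_open("127.0.0.1", port) else "closed"
        print(f"{name:8s} {port:5d} {state}")
```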

47). What do you understand by High availability?
A high availability system is one that is ready to serve any request at any time. High availability is usually achieved by adding redundancy, so that if one part fails, another part of the system can serve the request. To a client, it seems as if everything worked fine.

48). How does Cassandra provide high availability?
Cassandra is robust software. Nodes joining and leaving are handled automatically. With proper settings, Cassandra can be made failure resistant: if some of the servers fail, no data is lost. So you can deploy Cassandra on cheap commodity hardware or in a cloud environment, where hardware or infrastructure failures may occur.
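The idea of surviving node failures through redundancy can be sketched with a toy model in which each key is stored on several replicas and a read succeeds as long as one replica is still up. Everything here (the `Cluster` class, the node names, the way replicas are chosen) is an assumption made for illustration. It leaves out the consistency levels, repair, and failure detection that a real cluster relies on.

```python
class Cluster:
    """Toy replicated store: each key lives on `replication_factor` nodes."""

    def __init__(self, nodes, replication_factor):
        self.names = list(nodes)
        self.data = {name: {} for name in self.names}
        self.up = set(self.names)
        self.rf = replication_factor

    def replicas_for(self, key):
        # Pick a starting node from the key, then take the next rf nodes in order.
        start = sum(key.encode("utf-8")) % len(self.names)
        return [self.names[(start + i) % len(self.names)] for i in range(self.rf)]

    def fail(self, name):
        self.up.discard(name)

    def write(self, key, value):
        for name in self.replicas_for(key):
            if name in self.up:
                self.data[name][key] = value

    def read(self, key):
        for name in self.replicas_for(key):
            if name in self.up and key in self.data[name]:
                return self.data[name][key]
        raise LookupError(f"no live replica holds {key!r}")


cluster = Cluster(["n1", "n2", "n3", "n4"], replication_factor=3)
cluster.write("user:42", "alice")

# Take down two of the three replicas; the remaining one still answers.
first, second, third = cluster.replicas_for("user:42")
cluster.fail(first)
cluster.fail(second)
print(cluster.read("user:42"))
```

With a replication factor of 3, the read only fails once all three replicas of the key are down, which is what "proper settings" buys in this simplified picture.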

49). Who uses Cassandra?
Cassandra is in wide use around the world, and usage is growing all the time. Companies like Netflix, eBay, Twitter, Reddit, and Ooyala all use Cassandra to power pieces of their architecture, and it is critical to the day-to-day operations of those organizations. To date, the largest publicly known Cassandra cluster by machine count has over 300TB of data spanning 400 machines.
Because of Cassandra's ability to handle high-volume data, it works well for a myriad of applications. This means that it's well suited to handling projects from the high-speed world of advertising technology in real time to the high-volume world of big-data analytics and everything in between. It is important to know your use case before moving forward to ensure things like proper deployment and good schema design.

50). When to use secondary indexes?
Use a secondary index when you want to query on a column that isn't the primary key and isn't part of a composite key, and that column has few unique values. For example, a Town column is a good choice for secondary indexing because lots of people will be from the same town; date of birth, however, is not such a good choice.
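One rough way to apply this rule of thumb is to sample a column and compare the number of distinct values with the number of rows. The helper names and the 0.1 threshold below are arbitrary assumptions for this sketch, not a Cassandra recommendation.

```python
from datetime import date, timedelta


def cardinality_ratio(values) -> float:
    """Distinct values divided by total values (0.0 for an empty sample)."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def looks_indexable(values, max_ratio: float = 0.1) -> bool:
    """Heuristic: few unique values relative to row count suits a secondary index."""
    return cardinality_ratio(values) <= max_ratio


# 1000 people spread over 5 towns: low cardinality.
towns = ["Leeds", "York", "Bath", "Hull", "Ely"] * 200
# 1000 people with distinct birth dates: high cardinality.
birth_dates = [date(1970, 1, 1) + timedelta(days=i) for i in range(1000)]

print("town:", cardinality_ratio(towns), looks_indexable(towns))
print("date of birth:", cardinality_ratio(birth_dates), looks_indexable(birth_dates))
```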
