https://m.youtube.com/watch?v=B_HTdrTgGNs
At 13, best replication. Asynchronous. No need for leader election, etc. There are companies running active-active.
At 15:05, visual example of a write to a single node.
At 16, the mutation is first written to the commit log, which is append-only; then the memtable is updated.
At 17:21, what he likes is the dead-simple write path.
At 18:25, the memory fills up and it eventually creates an SSTable, sequentially writing the data in the table to disk. Because the writes are sequential, this is much faster.
At 20:50, SSTable. It is immutable.
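The write path in the notes above (commit log first, then memtable, then a sequential flush to an SSTable) can be sketched roughly as below. This is a toy model, not Cassandra's storage engine: the `ToyNode` class, the JSON-lines file format, the file names, and the entry-count flush threshold are all assumptions made for illustration.

```python
import json
import os
import tempfile


class ToyNode:
    """Toy single-node write path: commit log -> memtable -> SSTable flush."""

    def __init__(self, data_dir, memtable_limit=3):
        self.data_dir = data_dir
        self.memtable_limit = memtable_limit  # hypothetical threshold (entry count)
        self.commit_log_path = os.path.join(data_dir, "commit.log")
        self.memtable = {}
        self.sstables = []  # paths of flushed files, oldest first

    def write(self, key, value):
        # 1. Append the mutation to the commit log (append-only, sequential I/O).
        with open(self.commit_log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
        # 2. Update the in-memory memtable.
        self.memtable[key] = value
        # 3. Flush once the memtable is full.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable to a new file in sorted key order, in one
        # sequential pass. The file is never modified afterwards.
        path = os.path.join(self.data_dir, f"sstable-{len(self.sstables)}.jsonl")
        with open(path, "w", encoding="utf-8") as out:
            for key in sorted(self.memtable):
                row = {"key": key, "value": self.memtable[key]}
                out.write(json.dumps(row) + "\n")
        self.sstables.append(path)
        self.memtable = {}


def read_sstable(path):
    """Read a flushed file back as a list of row dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]


with tempfile.TemporaryDirectory() as data_dir:
    node = ToyNode(data_dir, memtable_limit=3)
    for key, value in [("b", 2), ("a", 1), ("c", 3), ("d", 4)]:
        node.write(key, value)
    flushed_keys = [row["key"] for row in read_sstable(node.sstables[0])]
    unflushed = dict(node.memtable)
```

Every write costs one append plus one in-memory update; the only other disk work is the occasional sequential flush, which is the point he makes about the write path being simple.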
At 22:30, compaction and merging of the SSTable segments. Are they called segments?
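A rough sketch of what merging SSTables during compaction could look like, assuming each table is a key-sorted list of `(key, value)` pairs and that the newest table wins when a key appears more than once. The `compact` function and the data layout are hypothetical; deletes and other real-world details are left out.

```python
import heapq
from itertools import groupby


def compact(sstables):
    """Merge key-sorted tables (oldest first) into one key-sorted table.

    For a key present in several tables, the value from the newest table wins.
    """
    # Tag each row with its table's age so that, for equal keys,
    # rows from newer tables sort after rows from older ones.
    tagged = (
        [(key, age, value) for key, value in table]
        for age, table in enumerate(sstables)
    )
    merged = heapq.merge(*tagged)  # streaming k-way merge of sorted inputs
    result = []
    for key, rows in groupby(merged, key=lambda row: row[0]):
        newest = list(rows)[-1]
        result.append((key, newest[2]))
    return result


older = [("a", 1), ("b", 2)]
newer = [("b", 20), ("c", 3)]
compacted = compact([older, newer])
```

Because the inputs are already sorted, the merge reads each table front to back once and writes the output sequentially, which matches the sequential-I/O theme of the write path.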
At 27:30, they use MD5 hashing to 128 bits.
27:37 consistent hashing to a 128-bit number
27:46 Token ring
29:40 replication factor
31:34 virtual nodes (or vnodes). Mentions that they also appear in paper-amazon-dynamo, and talks about how the vnodes are non-adjacent.
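The ideas from 27:30 to 31:34 (MD5 to a 128-bit token, token ring, replication factor, vnodes) fit together roughly as in this sketch. The `TokenRing` class, the node names, and the way vnode tokens are derived by hashing `"node#i"` are assumptions for illustration, not how Cassandra assigns tokens.

```python
import bisect
import hashlib


def token(key):
    """Map a key to a 128-bit integer using MD5."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class TokenRing:
    def __init__(self, nodes, vnodes_per_node=8, replication_factor=3):
        self.replication_factor = replication_factor
        # Each physical node owns several tokens scattered around the ring
        # (its vnodes), so its ranges are generally not adjacent.
        self.ring = sorted(
            (token(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key):
        """Walk clockwise from the key's token, collecting distinct nodes."""
        start = bisect.bisect_left(self.tokens, token(key))
        owners = []
        for offset in range(len(self.ring)):
            _, node = self.ring[(start + offset) % len(self.ring)]
            if node not in owners:
                owners.append(node)
            if len(owners) == self.replication_factor:
                break
        return owners


ring = TokenRing(["n1", "n2", "n3", "n4"], vnodes_per_node=8, replication_factor=3)
owners = ring.replicas("user:42")
```

With a replication factor of 3, `owners` holds three distinct nodes: the one whose vnode follows the key's token on the ring, plus the next two distinct nodes clockwise.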
talk-intro-to-cassandra#operation-changes-during-day1 At 37:50, he talks about doing downtime during the day, because taking a single node offline does not affect the cluster. talk-intro-to-cassandra#operation-changes-during-day1
talk-intro-to-cassandra#consistency-choice-at-read-write1 At 39, the consistency level is set with every read and write, so at the application level you can decide what you need for that specific call. talk-intro-to-cassandra#consistency-choice-at-read-write1
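A small sketch of what choosing the consistency level per call means in practice. `ONE`, `QUORUM`, and `ALL` are Cassandra consistency level names; the `required_acks` and `request_succeeds` helpers and the example calls are hypothetical and only model how many replicas must respond.

```python
from enum import Enum


class ConsistencyLevel(Enum):
    ONE = "ONE"
    QUORUM = "QUORUM"
    ALL = "ALL"


def required_acks(level, replication_factor):
    """Number of replicas that must respond for a request at this level."""
    if level is ConsistencyLevel.ONE:
        return 1
    if level is ConsistencyLevel.QUORUM:
        return replication_factor // 2 + 1
    return replication_factor  # ConsistencyLevel.ALL


def request_succeeds(live_replicas, level, replication_factor=3):
    """Hypothetical helper: can the replicas that are up satisfy this call?"""
    return live_replicas >= required_acks(level, replication_factor)


# The level is chosen per call, so one application can mix them.
# Assume replication factor 3 with one replica down (2 live).
page_view_logged = request_succeeds(2, ConsistencyLevel.ONE)
payment_recorded = request_succeeds(2, ConsistencyLevel.QUORUM)
audit_read = request_succeeds(2, ConsistencyLevel.ALL)
```

With one of three replicas down, the `ONE` and `QUORUM` calls still go through while the `ALL` call does not, which is the trade-off the application gets to pick for each read or write.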
In YouTube comments:
talk-intro-to-cassandra#ssds-better-at-random-but-not-perfect1 "You are right about SSDs and this is why we recommend them. There is still a bit of overhead on the transport layer when issuing random seeks. If you were to ask for a contiguous set of blocks and compared that with something issuing pure random reads, the random reads will have a much higher 95th percentile." talk-intro-to-cassandra#ssds-better-at-random-but-not-perfect1
At 41:05, a local quorum is a quorum within the data center.
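To make the local quorum arithmetic concrete, here is a sketch comparing it with a quorum over all replicas. The data center names and replication factors are made up, and `acks_needed` is a hypothetical helper, not a driver API.

```python
def quorum(replicas):
    """A majority of the given number of replicas."""
    return replicas // 2 + 1


def acks_needed(level, rf_by_dc, local_dc):
    """Replica responses required, given per-data-center replication factors."""
    if level == "LOCAL_QUORUM":
        # Only replicas in the coordinator's own data center are counted.
        return quorum(rf_by_dc[local_dc])
    if level == "QUORUM":
        # Majority of all replicas across every data center.
        return quorum(sum(rf_by_dc.values()))
    raise ValueError(f"unsupported level in this sketch: {level}")


rf_by_dc = {"dc-east": 3, "dc-west": 3}  # hypothetical topology
local = acks_needed("LOCAL_QUORUM", rf_by_dc, local_dc="dc-east")
overall = acks_needed("QUORUM", rf_by_dc, local_dc="dc-east")
```

Here the local quorum needs 2 of the 3 replicas in `dc-east`, while the cluster-wide quorum needs 4 of 6, so at least one response has to cross data centers.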
At 43:50, a lot of bad ZooKeeper stories.
At 49:40, inserts always overwrite. You could do an "if exists" check, but that is Paxos.
At 50:30, he has a three-hour data modeling and CQL talk on YouTube.
At maybe 53, reversed the order in the storage engine.
At 59:30, an audience question asks about the difference between Cassandra and Dynamo. Dynamo uses vector clocks and, for semantic conflict resolution, pushes the resolution to the application; Cassandra uses real clocks and last write wins.
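The two conflict-handling styles from that answer, sketched side by side. Both functions are illustrative assumptions: versions are modelled as `(timestamp, value)` pairs for last write wins, and vector clocks as dicts from node name to counter. How ties between equal timestamps are broken is an arbitrary choice of this sketch.

```python
def last_write_wins(a, b):
    """Each version is (timestamp, value); the higher timestamp wins.

    On a tie this sketch keeps `a`, which is an arbitrary choice.
    """
    return a if a[0] >= b[0] else b


def compare_vector_clocks(a, b):
    """Return "a", "b", "equal", or "concurrent" for two vector clocks."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither descends from the other: app must resolve
    if a_ahead:
        return "a"
    if b_ahead:
        return "b"
    return "equal"


# Last write wins: the replica value with the later wall-clock timestamp is kept.
winner = last_write_wins((100, "old address"), (105, "new address"))

# Vector clocks: two replicas each accepted an update the other has not seen.
verdict = compare_vector_clocks({"n1": 2, "n2": 1}, {"n1": 1, "n2": 2})
```

Last write wins always produces a single answer and silently drops the other value; the vector clock comparison can instead report that the two versions are concurrent, which is where the application has to step in.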
At 1:01:30, the questioner proposes that many people do not understand the trade-offs in these systems and choose the wrong setup based on bad assumptions.
At 1:02:20, Patrick mentions a paper called "your bank is not consistent" and explains that banks have a profit model based on eventual consistency, namely overdraft fees.
talk-intro-to-cassandra#vector-clocks-in-place-updates1 At 1:03:00, Patrick draws a distinction between logging and in-place updates. He says that vector clocks are especially useful for lots of in-place updates, whereas if you do logging it is a less necessary technique because you are not updating the same values. talk-intro-to-cassandra#vector-clocks-in-place-updates1
At 1:06:10, Cassandra is operationally very straightforward because the nodes are symmetric.