Cloud Whatis

Dictionary of cloud stuff. (Work in progress…)

See also: Cloud Native (Bookmarks).

Format

  1. I am considering converting this into an app and adding features. This post format is a quick-and-dirty attempt at releasing “early”: it is a small sample, as I already have well over 400 entries.
  2. Not showing timestamps per term, yet.
  3. Still thinking about RDFa, microformats… (Exposes how Q&D I did data modeling for this. And how hard taxonomies are. ;o)

Classifications

  1. My attempt at a more (semi-)structured knowledge base.
  2. Also, to normalize (standardize?) and de-hype terminology, characterizations.
  3. … turned out to be an impossible task.
  4. Recommended resources:
    1. Detailed classifications of many databases, done by dbdb, are linked as canonical.
    2. CNCF SIG Storage whitepaper gives a rudimentary taxonomy.

Dictionary

(Reminder: use Ctrl+F, or “/” on FF. ;o)

Key topics: NoSQL, Kubernetes, continuous, microservices, storage, Hadoop

    1. h
    2. c
    3. doc
    4. dbdb
    5. W
    6. Tw

    Aerospike

    SSD-optimized distributed key-value database, replicated, secondary indexes.
    Topics
    Data model
    Query custom API
    Implementation C
    Stored procedures Lua
    Created 2009
    1. Jepsen: Aerospike 3.99.0.3 / 2018: detailed analysis.
    2. Jepsen: Aerospike / 2015: rigorous analysis, especially of what ACID means, critique of version 3.5.4.

    1. c
    2. dbdb

    Akutan (formerly Beam)

    Distributed knowledge graph store — aka RDF or triple store.
    Topics
    Data model
    Implementation Go
    Created 2019

    1. h
    2. c
    3. doc
    4. CNCF
    5. W
    6. Tw

    Alluxio (formerly Tachyon)

    Virtual distributed file system, used as data orchestration layer, provides multiple data access interfaces, unified mounting name-space, and hierarchical cache, between cloud storage and, eg, Kubernetes workloads, Presto, Spark, Hive, Kubeflow. Multi-cloud.
    Topics
    Implementation Java
    Scale petabyte
    Created 2013
    1. CSI implementation to provide POSIX access to Alluxio in containerized environments such as Kubernetes
    2. Using Alluxio to Optimize and Improve Performance of Kubernetes-Based Deep Learning in the Cloud / 2020-05: whitepaper (PDF), detailed usecase, performance analysis.
    3. Improving Presto Latencies with Alluxio Data Caching / Rohit Jain, et al, 2020-06: architecture, performance benchmarks.

    1. W

    Ant (Apache Ant)

    Make-like build tool.
    Topics
    Implementation Java
    Created 2000

    1. c

    AnyRPC

    Multi-protocol remote procedure call C++ library; supports JSON-RPC, XML-RPC, MessagePack RPC.
    Topics
    Implementation C++
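
    To illustrate one of the protocols listed above, here is a minimal sketch of a JSON-RPC 2.0 request and response using only Python's standard library. It is not AnyRPC's API (AnyRPC is a C++ library); the helper names `make_request` and `handle` and the method `add` are hypothetical.

```python
import json
from typing import Any, List


def make_request(method: str, params: List[Any], request_id: int) -> str:
    # A JSON-RPC 2.0 request carries a version tag, method name, params, and id.
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


def handle(raw: str) -> str:
    # Toy server side: dispatch to a table of known methods (hypothetical).
    methods = {"add": lambda a, b: a + b}
    req = json.loads(raw)
    func = methods.get(req["method"])
    if func is None:
        # -32601 is the "Method not found" error code in JSON-RPC 2.0.
        body = {
            "jsonrpc": "2.0",
            "error": {"code": -32601, "message": "Method not found"},
            "id": req["id"],
        }
    else:
        body = {"jsonrpc": "2.0", "result": func(*req["params"]), "id": req["id"]}
    return json.dumps(body)


response = json.loads(handle(make_request("add", [2, 3], 1)))
```

    The response echoes the request `id`, which is how a client matches replies to calls.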

    1. h
    2. c
    3. img
    4. dbdb
    5. CNCF
    6. W
    7. SO
    8. Tw

    ArangoDB (formerly AvocadoDB)

    Multi-model distributed database, JSON, in-memory with persistent backend. Supports joins, and either multi-collection transactions for ACID or single-document transactions for performance.
    Topics
    Data model
    Query AQL, JSON oriented
    Implementation C++, JavaScript
    Stored procedures JavaScript (as microservices)
    Created 2011
    1. ArangoDB vs Neo4j: detailed comparison; NB: APIs, QL, and clustering.
    2. ArangoDB Foxx repositories

    1. h
    2. dbdb
    3. W

    Aurora (Amazon Aurora)

    AWS managed (by RDS) RDBMS compatible with MySQL and PostgreSQL. Distributed, fault-tolerant, self-healing, low-latency read replicas, point-in-time recovery, continuous backup to S3, and replication across three Availability Zones.
    Topics
    Created 2014
    Alternatives

    1. h
    2. c
    3. W

    Brooklyn (Apache Brooklyn)

    Framework for modeling, deploying and managing distributed applications defined using declarative YAML blueprints.
    Topics
    Implementation Java
    Created 2012

    1. h
    2. c
    3. W
    4. Tw

    Buildr (Apache Buildr)

    Build tool for Java-based applications.
    Topics
    Implementation Ruby

    1. h
    2. W

    Calcite (Apache Calcite)

    SQL parser, relational algebra API, query planning engine.
    Topics

    1. W

    Cayenne (Apache Cayenne)

    ORM framework.
    Topics

    1. CNCF

    continuous (CI/CD: continuous integration/​delivery/​deployment)

    Topics
    1. CD tools drawbacks / discussion on Reddit /r/devops, 2019-04: *Spinnaker, *GitLab, *Helm…
    2. CNCF SIG Application Delivery Charter: scope and topics of the lifecycle of cloud-native applications.
    3. Model of Cloud-Native App Delivery | Google Slides

    1. h
    2. c
    3. doc
    4. img
    5. dbdb
    6. CNCF
    7. W
    8. SO
    9. Tw

    Couchbase (Couchbase Server, formerly Membase)

    Distributed (shared-nothing) database, JSON document oriented, in-memory, swapped to disk, append-only.
    Topics
    Data model
    Indexes hash, B+ tree, full text
    Query N1QL (SQL extended for JSON)
    Implementation Erlang
    Created 2009

    1. h
    2. c
    3. doc
    4. dbdb
    5. W

    CouchDB (Apache CouchDB, Cluster Of Unreliable Commodity Hardware)

    Document database.
    Topics
    Data model
    Query REST API
    Implementation Erlang
    Created 2005
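
    A rough sketch of what querying over the REST API looks like, assuming a CouchDB server at `http://localhost:5984` (5984 is the default port). The database name `glossary` and the helpers `put_doc` and `get_doc` are hypothetical. The requests are only built here, not sent.

```python
import json
import urllib.request

BASE = "http://localhost:5984"  # assumed local server address


def put_doc(db: str, doc_id: str, doc: dict) -> urllib.request.Request:
    # PUT /{db}/{doc_id} with a JSON body creates a document.
    # (Updating an existing document also needs its current _rev.)
    return urllib.request.Request(
        f"{BASE}/{db}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def get_doc(db: str, doc_id: str) -> urllib.request.Request:
    # GET /{db}/{doc_id} fetches the document back as JSON.
    return urllib.request.Request(f"{BASE}/{db}/{doc_id}", method="GET")


write = put_doc("glossary", "couchdb", {"kind": "document database"})
read = get_doc("glossary", "couchdb")
# To actually send a request: urllib.request.urlopen(write), with a server running.
```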

    1. h
    2. doc
    3. W

    Cypher

    Declarative graph query language.
    Topics
    Created 2011

    1. h
    2. c
    3. dbdb
    4. CNCF

    Dgraph

    Distributed graph database. Originally used RocksDB, later replaced by Badger.
    Topics
    Data model
    Query GraphQL variant
    Implementation Go
    Created 2015

    1. h
    2. c
    3. doc
    4. dbdb

    Dqlite

    Distributed SQLite, embedded (C library), persistent RDBMS, Raft consensus.
    Topics
    Query SQL
    Implementation C
    Created 2017
    Alternatives

    1. h
    2. dbdb
    3. W

    DynamoDB (Amazon DynamoDB)

    AWS key-value and document store, auto-scaling, priced on throughput instead of storage, integration with Hadoop.
    Topics
    Data model
    Created 2012
    1. If DynamoDB sucks at sorting/querying… which database should we use? / 2020-03: Reddit discussion of DynamoDB features, strengths/weaknesses, (general NoSQL) design considerations: use it for big data, not relational queries.
    2. DynamoDB’s new autoscaling in action for https://github.com/tj/gh-polls, low traffic either way but at least I don’t have to over-provision! / 2017-07, tweet by TJ Holowaychuk

    1. c
    2. doc
    3. dbdb
    4. W

    Elliptics

    Distributed storage, eventually consistent, distributed hash table (DHT).
    Topics
    Data model
    Implementation C++, Python, Go
    Created 2008
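
    For readers new to distributed hash tables, here is a generic sketch of the core idea: keys and nodes are hashed onto the same ring, and each key belongs to the next node clockwise. This is not Elliptics' actual implementation; the class `HashRing`, the node names, and the choice of SHA-256 are assumptions made for illustration.

```python
import bisect
import hashlib
from typing import List, Tuple


def ring_position(name: str) -> int:
    # Hash a node name or key onto a fixed 64-bit integer ring.
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes: List[str]) -> None:
        # Sort nodes by ring position so lookups can use binary search.
        self._ring: List[Tuple[int, str]] = sorted(
            (ring_position(n), n) for n in nodes
        )
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First node clockwise from the key's position, wrapping around.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._ring[idx % len(self._ring)][1]


nodes = ["node-a", "node-b", "node-c"]
ring = HashRing(nodes)
owner = ring.node_for("user:42")
```

    Removing a node only moves the keys that node owned; keys owned by other nodes stay put, which is why this layout suits clusters whose membership changes.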

    1. h
    2. c
    3. doc
    4. dbdb
    5. CNCF
    6. W
    7. Tw

    FoundationDB

    Distributed fault-tolerant database, ACID, for OLTP.
    Topics
    Data model
    Implementation C++
    Created 2009

    1. h
    2. c
    3. doc
    4. dbdb
    5. SO
    6. Tw

    Geode (Apache Geode)

    Distributed in-memory data grid supporting caching and data computation.
    Topics
    Data model
    Query OQL
    Implementation Java
    Created 2002

    1. h
    2. c
    3. doc
    4. W
    5. Tw

    GlusterFS (Red Hat Gluster Storage)

    Network file system, software-defined storage.
    Topics

    1. h
    2. W

    Helix (Apache Helix)

    Cluster management framework for partitioned and replicated distributed resources.
    Topics
    Implementation Java

    1. h
    2. c
    3. doc
    4. dbdb
    5. W

    Ignite (Apache Ignite)

    Distributed in-memory key-value store.
    Topics
    Data model
    Implementation Java
    Created 2014

    1. h
    2. c
    3. dbdb
    4. CNCF
    5. W
    6. Tw

    Infinispan

    Distributed cache and key-value storage, in-memory.
    Topics
    Data model
    Implementation Java
    Created 2009

    1. h
    2. c
    3. CNCF
    4. W
    5. Tw

    Kafka (Apache Kafka)

    Stream processing, cluster, middleware for message passing.
    Topics
    Implementation Java

    key-value (KV)

    Simplest storage model: values are opaque (unstructured) and the key is the only index, which implies no schema, no queries, no joins. This simplicity affords low latency, typically sub-millisecond. Typically used for caching, message queues, *pub-sub…
    Topics
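
    A minimal sketch of the model described above, assuming an in-memory dict as the backing store. The class name `KVStore` and its methods are hypothetical, not any product's API. Values are stored as opaque bytes, and the only lookup path is the exact key.

```python
from typing import Dict, Optional


class KVStore:
    """Toy key-value store: opaque byte values, one index (the key)."""

    def __init__(self) -> None:
        # The dict is the single index; nothing else is indexed.
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # The store never inspects the value, so no schema is enforced.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Lookup by exact key is the only "query" available.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed.
        return self._data.pop(key, None) is not None


store = KVStore()
store.put("session:42", b'{"user": "ada"}')
cached = store.get("session:42")
missing = store.get("session:43")
```

    Anything beyond get/put/delete, such as finding all sessions for a user, has to be built by the application on top of the keys.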

    1. h
    2. c
    3. doc
    4. CNCF
    5. W
    6. Tw

    Kubernetes (Kubes, Kube, K8s)

    Automating deployment, scaling, and management of multi-container applications. Orchestration for ephemeral workloads.
    Topics
    Implementation Go
    Created 2014
    Alternatives
    1. awesome-kubernetes: curated bookmarks.
    2. 10 Most Common Reasons Kubernetes Deployments Fail (Part 1) / Ross Kukulinski, 2017-02: nice intro to Kubernetes, orchestration complexity.

    1. W

    NoSQL

    Databases: no SQL, not relational, not only SQL, NewSQL…
    Topics

    1. h
    2. c
    3. doc
    4. dbdb
    5. W
    6. SO
    7. Tw

    PouchDB

    Document database in JavaScript, designed to run in a browser, derived from/inspired by CouchDB. Browser’s local storage synchronized, when online, with server-side CouchDB or LevelDB.
    Topics
    Data model
    Implementation JavaScript
    Created 2010

    1. h
    2. c
    3. W

    Raft (Raft Consensus)

    Consensus algorithm designed as an alternative to *Paxos.
    Topics

    1. h
    2. doc

    RDS (Amazon Relational Database Service, Amazon RDS)

    Web service that makes it easier to set up, operate, and scale a relational database. Available for, eg, Aurora, PostgreSQL, MySQL, MariaDB, Oracle.
    Topics

    1. h
    2. c
    3. dbdb
    4. W

    Riak

    Distributed key-value database, pluggable backend store, supports LevelDB.
    Topics
    Data model
    Implementation Erlang
    Created 2009

    1. h
    2. c
    3. CNCF
    4. Tw

    ShardingSphere (Apache ShardingSphere)

    JDBC middleware/proxy for sharding, distributed transactions. Planned sidecar.
    Topics
    Implementation Java
    Created 2018

    1. h
    2. c

    SODA (Simple Oracle Document Access)

    Set of NoSQL-style APIs to use Oracle database as document store.
    Topics
    Data model

    1. h
    2. c
    3. dbdb
    4. W

    Solr (Apache Solr)

    Full-text search server with a REST-like API. Features include hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, and geospatial search. Built on Lucene; the two projects were merged.
    Topics
    Data model
    Implementation Java
    Created 2004

    1. doc

    Swarm (Swarm mode)

    Docker includes swarm mode for natively managing a cluster.
    Topics
    Alternatives
    1. Why I’m leaving Kubernetes for Swarm / Jonathan Kosgei, 2017-02: “Humans should not have to write/read config files”. Discussion.
    2. docker-swarm-cluster: configurations for combining several tools for creating a Docker Swarm cluster: Swarm Dashboard, Traefik, Portainer, Prometheus, Grafana…
    3. terraform-docker-swarm-aws: Terraform script to set up a Docker Swarm on AWS.

    1. h
    2. c
    3. doc
    4. dbdb

    Tarantool

    Multi-threaded database, stored procedures in Lua/C, partial SQL.
    Topics
    Data model
    Implementation C, Lua
    Created 2008

    1. h
    2. c
    3. doc
    4. dbdb
    5. CNCF
    6. Tw

    TiDB

    Distributed, hybrid transactional and analytical processing (HTAP). Uses TiKV.
    Topics
    Query SQL
    Implementation Go
    Created 2015

    1. h
    2. c
    3. doc
    4. pkg
    5. CNCF
    6. Tw

    Zenko

    Multi-cloud data management, multiple backends, file systems and S3 compatible. Collection of microservices written mostly in JavaScript.
    Topics
    Implementation JavaScript


