Cloud Whatis
Dictionary of cloud native ecosystem. (Work in progress; last updated: 2021-04-28.)
Format
- I am considering converting this into an app and adding features. This blog post is a quick-and-dirty attempt at releasing "early": it is a small sample, mostly drafts, of the nearly 500 entries I already have. Look at all this red ink. Crazy landscape.
- Still thinking about RDFa, microformats… (Exposes how quick-and-dirty my data modeling for this was, and how hard taxonomies are. ;o)
- Not showing timestamps per entry, yet.
Classifications
- My attempt at a more (semi-)structured knowledge base.
- Also, to normalize (standardize?) and de-hype terminology, characterizations.
- … turned out to be an impossible task.
- Recommended resources:
- Detailed classifications of many databases, done by dbdb, are linked as canonical.
- CNCF SIG Storage whitepaper gives a rudimentary taxonomy.
- The Dictionary of Cloud-Native App Delivery
Dictionary
(Reminder: search with Ctrl+F, or "/" on FF. ;o)
Major topics: NoSQL, Kubernetes, continuous, microservices, storage, Hadoop…
3Box
Decentralized secure storage. Uses IPFS, OrbitDB.
Topics Data model
Accumulo (Apache Accumulo)
Distributed key-value store, based on BigTable, HDFS, ZooKeeper.
Topics Data model Implementation Java Created 2008
ACM (AWS Certificate Manager)
Manage SSL/TLS certificates on AWS resources. WUI. API. CLI.
Topics
ActiveMQ (Apache ActiveMQ)
Message broker.
Topics Implementation Java Created 2004
Aerospike
SSD-optimized distributed key-value database, replicated, secondary indexes.
Topics Data model Query custom API Implementation C Stored procedures Lua Created 2009
- Jepsen: Aerospike 3.99.0.3 / 2018: detailed analysis.
- Jepsen: Aerospike / 2015: rigorous analysis, especially of what ACID means, critique of version 3.5.4.
Agola
Containerized GitOps CI/CD.
Topics Implementation Go
Airflow (Apache Airflow)
Workflow orchestration framework, pipelines (workloads) automation/scheduling.
Topics Implementation Python
Akka
Toolkit for distributed applications.
Topics Implementation Scala, Java
Akutan (formerly Beam)
Distributed knowledge graph store, aka RDF or triple store.
Topics Data model Implementation Go Created 2019
Alluxio (formerly Tachyon)
Virtual distributed file system, used as data orchestration layer, provides multiple data access interfaces, unified mounting name-space, and hierarchical cache, between cloud storage and, eg, Kubernetes workloads, Presto, Spark, Hive, Kubeflow. Multi-cloud.
$ ./bin/alluxio fs mount alluxio://master:port/nfs /mnt/nfs
Topics Implementation Java Scale petabyte Created 2013
- CSI implementation to provide POSIX access to Alluxio in containerized environments such as Kubernetes
- Using Alluxio to Optimize and Improve Performance of Kubernetes-Based Deep Learning in the Cloud / 2020-05: whitepaper (PDF), detailed usecase, performance analysis.
- Improving Presto Latencies with Alluxio Data Caching / Rohit Jain, et al, 2020-06: architecture, performance benchmarks.
Ambari (Apache Ambari)
Hadoop cluster provisioning, monitoring, management.
Topics
Ambassador
Envoy-based API gateway?
Topics
AMQP (Advanced Message Queuing Protocol)
Application layer protocol for message oriented middleware: queuing, routing, point-to-point, publish-subscribe. Wire-level protocol: specifies binary data format.
Topics Created 2003 Alternatives
Ansible
Automation, provisioning, and orchestration tool. Agentless: uses SSH. Everything in YAML. 1,300+ integrations.
Topics Implementation Python
- Module Index (Categories)
- Ansible Galaxy: community contributed roles (tasks) and collections thereof.
- [rant] I feel like I'm not grokking ansible | /r/ansible / 2020-08: discussion: declarative vs programming language, maturity…
- Why you should try pyinfra | Lobsters / 2020-06: interesting discussion, hate on Ansible.
Ant (Apache Ant)
Make-like build tool.
Topics Implementation Java Created 2000
AnyRPC
Multi-protocol remote procedure call C++ library; supports JSON-RPC, XML-RPC, MessagePack RPC.
Topics Implementation C++
Apache Shiro
Authn/z framework.
Topics Implementation Java Created 2004
API gateway
- OpenAPI Specification (formerly Swagger)
App Mesh (AWS App Mesh)
Service mesh on AWS.
Topics
AppSync (AWS AppSync)
Managed GraphQL service, serverless.
Topics Created 2018
ArangoDB (formerly AvocadoDB)
Multi-model distributed database, JSON, in-memory with persistent backend. Joins, either multi-collections transactions for ACID, or single-document transactions for performance.
Topics Data model Query AQL, JSON oriented Implementation C++, JavaScript Stored procedures JavaScript (as microservices) Created 2011
- ArangoDB vs Neo4j: detailed comparison; NB: APIs, QL, and clustering.
- ArangoDB Foxx repositories
Argo (ArgoCD)
Kubernetes native workflows, events, CI and CD.
Topics Implementation Go, TypeScript
Arrow (Apache Arrow)
Columnar in-memory analytics.
Topics Data model Created 2016
Artifactory
Container image registry, artifact repository.
Topics Implementation Java
Ascoltatori
Pub-sub library, multiple brokers/protocols: Redis, MongoDB, MQTT, AMQP, ZeroMQ, *QlobberFSQ, Kafka. Unmaintained?
Topics Implementation JavaScript
Aurora (Amazon Aurora)
AWS managed (by RDS) RDBMS compatible with MySQL and PostgreSQL. Distributed, fault-tolerant, self-healing, low-latency read replicas, point-in-time recovery, continuous backup to S3, and replication across three Availability Zones.
Topics Created 2014 Alternatives
Auto DevOps
GitLab CI template (automatic pipelines) for: build, test, code quality, SAST, dependency scanning, license management, container scanning, review apps, DAST, deployment, and browser performance testing.
Topics
- Category Direction - Auto Devops: GL's plans for Auto DevOps; useful overview.
- GitLab Auto DevOps demo (video) / 2018-08
- Using GitLab Auto DevOps with Kubernetes Through Rancher's Authorized Cluster Endpoint / 2019-04: example usage.
AWS (Amazon Web Services)
Public cloud provider.
Topics Created 2006
Baucis
Framework for REST APIs, based on Mongoose/MongoDB, Express/Node. HTTP methods for CRUD, full text search.
Topics Implementation JavaScript
Bazel
Build tool.
Topics
- Bazel Performance in a CI Environment / Filip Nikolovski, 2020-03: ephemeral containers (GitLab) vs caching, and GIT_CLEAN_FLAGS.
BerkeleyDB (BDB)
Embedded database, ACID.
Topics Data model Query SQL, PL/SQL Created 1994
BigQuery
Query engine, serverless data warehouse, append-only tables, for interactive analysis of datasets over Google Cloud Storage. REST API, CLI tools. Integrated with Google Apps Script (ie Google Docs/Drive).
Topics Data model Query SQL, data encoded as CSV or JSON Stored procedures JavaScript Created 2010
- Getting started with BigQuery - Colaboratory: Jupyter notebook on Drive.
Bitbucket Pipelines
Bitbucket's integrated CI/CD.
Topics
BooPickle
Serialization format, binary, efficient. Scala.
Topics Implementation Scala
Brigade
Event-driven scripting for Kubernetes.
Topics
- Kashti: web UI for Brigade pipelines.
Brooklyn (Apache Brooklyn)
Framework for modeling, deploying and managing distributed applications defined using declarative YAML blueprints.
Topics Implementation Java Created 2012
Buildpacks (Cloud Native Buildpacks, CNB)
System for building OCI images.
Topics Created 2011
- Intro to Cloud Native Buildpacks (video) / Terence Lee & Emily Casey, 2019-11
- Cloud Foundry Buildpacks or Dockerfiles / 2018-03
Buildr (Apache Buildr)
Build tool for Java-based applications.
Topics Implementation Ruby
Calcite (Apache Calcite)
SQL parser, relational algebra API, query planning engine.
Topics
Camel (Apache Camel)
Message routing framework for Enterprise Integration Patterns.
Topics
- Components reference: 350+ components.
- Cf suggested introductions
Cassandra (Apache Cassandra)
Distributed database, (wide-) columnar, replicated (across data centers), tunable consistency. Hadoop integration. LSM-Trees.
Topics Data model Query CQL Implementation Java Scale petabyte Created 2008
- 2012 in review: Performance / Jonathan Ellis: compares architecture/features with MongoDB, Riak, HBase.
- Cassandra and Solid State Drives (slides) / Rick Branson, 2012
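A rough sketch of what "tunable consistency" means in practice: with N replicas, a read is guaranteed to overlap the latest acknowledged write when R + W > N. The level names ONE, QUORUM, ALL are Cassandra's; the helper functions are hypothetical, and the model ignores hinted handoff, read repair, and multi-datacenter levels.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for QUORUM: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Map a consistency level name to the number of replicas that must respond."""
    required = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def read_sees_latest_write(write_level: str, read_level: str, replication_factor: int) -> bool:
    """True when every read set overlaps every write set, ie R + W > N."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


# Compare a few write/read level pairs at replication factor 3.
for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ALL", "ONE")]:
    print(write_level, read_level, read_sees_latest_write(write_level, read_level, 3))
```

Lower levels trade that overlap guarantee for latency and availability, which is the "tunable" part.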
Cayenne (Apache Cayenne)
ORM framework.
Topics
CBOR (Concise Binary Object Representation)
Serialization format, binary, schemaless. IETF RFC 8949, et al: signing, encryption, web tokens, data definition language.
Topics
CFEngine
Topics Implementation C Created 1993
ChartMuseum
Helm chart repository server.
Topics
Chef
Topics Implementation Ruby Created 2009
Cilium
Container Network Interface (CNI) plugin based on eBPF.
Topics
ClickHouse
Columnar database for OLAP. No transactions.
Topics Data model Query SQL-like Implementation C++ Scale petabyte Created 2009
CloudEvents
Specification for event data standard.
Topics
- CloudEvents Primer
- Introduction to CloudEvents — v1.0 and Beyond (video) / Doug Davis, Clemens Vasters, 2020-09 (English; first 20 seconds in Chinese): introduction, discovery/subscription, schema registry.
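For a feel of what the spec standardizes, here is a minimal sketch of a structured-mode JSON event. CloudEvents v1.0 requires the attributes specversion, id, source, and type; the helper names, the source path, and the event type below are made up for illustration.

```python
import json
import uuid
from datetime import datetime, timezone

# Attributes that CloudEvents v1.0 marks as required.
REQUIRED = ("specversion", "id", "source", "type")


def make_event(source, event_type, data):
    """Build a structured-mode CloudEvents v1.0 envelope as a plain dict."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        "time": datetime.now(timezone.utc).isoformat(),  # optional, RFC 3339
        "datacontenttype": "application/json",           # optional
        "data": data,
    }


def missing_attributes(event):
    """Return the required attributes that are absent or empty."""
    return [name for name in REQUIRED if not event.get(name)]


# Hypothetical producer and event type, for illustration only.
event = make_event("/example/orders", "com.example.order.created", {"order_id": 42})
print(json.dumps(event, indent=2))
print("missing:", missing_attributes(event))
```

The point of the envelope is that routers and brokers can filter on source and type without parsing the payload.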
Cloudflare Workers
Topics
CloudFormation (AWS CloudFormation, CFN)
AWS provisioning templates, *IaC, YAML or JSON.
Topics
Cluster API (CAPI)
Declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters. Kubernetes sub-project started by Kubernetes SIG Cluster Lifecycle.
Topics
CNAB (Cloud Native Application Bundles)
Topics
CNI (Container Network Interface)
Networking for Linux containers.
Topics
CockroachDB (CRDB)
Distributed database, OLTP, PostgreSQL compatible.
Topics Data model Query SQL Created 2015 Alternatives
- How to Choose Between PostgreSQL and CockroachDB / Steve Croce, 2019-08-15
Codefresh
Hosted CI/CD, GitOps, Kubernetes based.
Topics
Compojure
Small routing library for Ring.
Topics Implementation Clojure
Concourse
Pipelines automation, containerized. YAML configurations.
Topics Implementation Go, Elm
Conjur (CyberArk Conjur)
Secrets management for microservices. RBAC. CLI, REST. Uses Nginx, Postgres.
Topics Implementation Ruby
containerization
continuous (CI/CD: continuous integration/delivery/deployment)
Automation…
- Continuous Delivery Foundation (CDF); @CDeliveryFdn
- CD tools drawbacks / discussion on Reddit /r/devops, 2019-04: *Spinnaker, *GitLab, *Helm…
- CNCF SIG Application Delivery Charter: scope and topics of the lifecycle of cloud-native applications.
- Model of Cloud-Native App Delivery | Google Slides
Cortex
Prometheus metrics storage and query.
Topics Created 2016
Couchbase (Couchbase Server, formerly Membase)
Distributed (shared-nothing) database, JSON document oriented, in-memory, swapped to disk, append-only.
Topics Data model Indexes hash, B+ tree, full text Query N1QL (SQL extended for JSON) Implementation Erlang Created 2009
CouchDB (Apache CouchDB, Cluster Of Unreliable Commodity Hardware)
Document database.
Topics Data model Query REST API Implementation Erlang Created 2005
CrateDB
Distributed (shared-nothing) SQL database, document oriented. Components from Presto, Lucene, Elasticsearch and *Netty.
Topics Data model Query SQL Implementation Java Created 2014
CRD (Custom Resource Definition)
Kubernetes API extensions to manage resources other than native objects such as pods and services.
Topics
- Kubernetes docs: Extending Kubernetes: Custom Resources
- doc.crds.dev: documentation browser for CRDs.
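To make the CRD entry concrete, a minimal sketch that assembles an apiextensions.k8s.io/v1 CustomResourceDefinition manifest as a dict and prints it as JSON (which kubectl accepts alongside YAML). The crd_manifest helper is hypothetical; the group, kind, and "schedule" field are placeholder values in the style of the Kubernetes docs' CronTab example.

```python
import json


def crd_manifest(group, kind, plural, version="v1"):
    """Assemble a minimal CustomResourceDefinition manifest as a dict."""
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        # The CRD object's name must be <plural>.<group>.
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {"kind": kind, "plural": plural, "singular": kind.lower()},
            "versions": [
                {
                    "name": version,
                    "served": True,   # exposed by the API server
                    "storage": True,  # the version persisted in etcd
                    "schema": {
                        "openAPIV3Schema": {
                            "type": "object",
                            "properties": {
                                "spec": {
                                    "type": "object",
                                    # Placeholder field, for illustration only.
                                    "properties": {"schedule": {"type": "string"}},
                                }
                            },
                        }
                    },
                }
            ],
        },
    }


manifest = crd_manifest("stable.example.com", "CronTab", "crontabs")
print(json.dumps(manifest, indent=2))
```

Once such a definition is applied, the new kind can be managed with the usual Kubernetes API verbs, like native objects.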
CRI-O (née OCID)
Implementation of the Kubernetes Container Runtime Interface (*CRI) designed to enable the use of Open Container Initiative (OCI) compatible runtimes; allows the kubelet to use different container runtimes, without needing to recompile Kubernetes.
Topics Created 2016
Crossplane
Kubernetes add-on that extends any cluster with the ability to provision and manage cloud infrastructure, services, and applications using kubectl, GitOps, or any tool that works with the Kubernetes API.
Topics Implementation Go
- Crossplane vs Cloud Provider Infrastructure Addons / Nic Cope, 2021-02
- Is Crossplane the Infrastructure LLVM? / Daniel Mangum, 2021-03
Crux
Document database with bitemporal graph queries (indexing of schemaless documents), layer over LMDB, RocksDB, or Kafka.
Topics Query EDN, Datalog Implementation Clojure Created 2019
- A Bitemporal tale – History. Of histories. / 2019-07-16: code example (Clojure interactive notebook?).
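"Bitemporal" means every document version carries two timestamps: valid time (when the fact holds in the modelled world) and transaction time (when the store recorded it). A toy sketch of the idea follows; this is not Crux's API, and the BitemporalStore class, the integer timestamps, and the sample data are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    valid_time: int  # when the fact became true in the modelled world
    tx_time: int     # when the store recorded it
    doc: dict


class BitemporalStore:
    """Toy in-memory store; a logical counter stands in for transaction time."""

    def __init__(self):
        self._history = {}  # entity id -> list of Version
        self._clock = 0

    def put(self, eid, doc, valid_time):
        """Record a version and return its transaction time."""
        self._clock += 1
        self._history.setdefault(eid, []).append(Version(valid_time, self._clock, doc))
        return self._clock

    def get(self, eid, valid_time, tx_time=None):
        """Document as of valid_time, as known to the store at tx_time."""
        tx_time = self._clock if tx_time is None else tx_time
        visible = [
            v for v in self._history.get(eid, [])
            if v.valid_time <= valid_time and v.tx_time <= tx_time
        ]
        if not visible:
            return None
        # Latest valid time wins; later transactions correct earlier ones.
        return max(visible, key=lambda v: (v.valid_time, v.tx_time)).doc


store = BitemporalStore()
store.put("alice", {"city": "Oslo"}, valid_time=10)
store.put("alice", {"city": "Rome"}, valid_time=20)
store.put("alice", {"city": "Bergen"}, valid_time=10)  # retroactive correction

print(store.get("alice", valid_time=15))             # current knowledge
print(store.get("alice", valid_time=15, tx_time=2))  # what we believed earlier
```

The second query shows the payoff: the correction does not erase what the store believed before it was made.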
CSI (Container Storage Interface)
Topics
- Kubernetes Documentation: Concepts: Storage
- Kubernetes Storage Lingo 101 (video) / Saad Ali, KubeCon 2018
Cypher
Declarative graph query language.
Topics Created 2011
data lake
- Cloud Data Lake vs On-Premises Data Lake / Eran Levy, 2019-03-18
data model
Topics
Delta Lake
Storage layer over data lake (S3, Hadoop), provides ACID, data warehouse features, using Spark.
Topics
- Presto and Athena to Delta Lake integration / 2020-07-06
Dgraph
Distributed graph database. Uses RocksDB.
Topics Data model Query GraphQL variant Implementation Go Created 2015
Distroless
Build Docker images without OS stuff, just your app.
Topics Implementation Starlark (Bazel)
DocumentDB (Amazon DocumentDB)
Managed MongoDB-compatible database on AWS.
Topics Data model Alternatives
Dqlite
Distributed SQLite, embedded (C library), persistent RDBMS, Raft consensus.
Topics Query SQL Implementation C Created 2017 Alternatives
Dragonfly
File distribution system.
Topics Implementation Go Created 2015
- CNCF to host Dragonfly, a cloud-native file distribution system for Kubernetes / Mike Wheatley, 2020-04
Drill (Apache Drill)
Distributed SQL query engine. Schema-less.
Topics Data model Query SQL Implementation Java Created 2012
Drone
Container-native CD: uses YAML (a superset of Docker Compose) to define and execute pipelines inside containers.
Topics
DRPC
Lightweight replacement for gRPC.
Topics Implementation Go
- Introducing DRPC: Our Replacement for gRPC / JT Olio, Jeff Wendling, 2021-04
Druid (Apache Druid)
Distributed real-time analytics, uses ZooKeeper, HDFS and a relational database for metadata.
Topics Data model Query SQL, REST/HTTP Created 2011
DuckDB
Embedded database for OLAP. ACID.
Topics Data model Query SQL Implementation C++ Created 2018 Alternatives
Dynamo
Highly available distributed key-value storage.
Topics Created 2004
DynamoDB (Amazon DynamoDB)
AWS key-value and document store, auto-scaling, priced on throughput instead of storage, integration with Hadoop.
Topics Data model Created 2012
- If DynamoDB sucks at sorting/querying… which database should we use? / 2020-03: Reddit discussion of DynamoDB features, strengths/weaknesses, (general NoSQL) design considerations: use it for big data, not relational queries.
- DynamoDB's new autoscaling in action for https://github.com/tj/gh-polls, low traffic either way but at least I don't have to over-provision! / 2017-07, tweet by TJ Holowaychuk
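To illustrate "priced on throughput": in provisioned mode, capacity is counted in units, where one read unit covers a strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half) and one write unit covers a write per second of an item up to 1 KB. A back-of-envelope sketch; the function names are hypothetical, and on-demand mode and transactional requests are ignored.

```python
import math

READ_UNIT_BYTES = 4 * 1024  # one read unit covers up to 4 KB
WRITE_UNIT_BYTES = 1024     # one write unit covers up to 1 KB


def read_capacity_units(item_bytes, reads_per_second, strongly_consistent=True):
    """Provisioned read units for a steady rate of same-sized item reads."""
    units_per_read = math.ceil(item_bytes / READ_UNIT_BYTES)
    if not strongly_consistent:
        units_per_read /= 2  # eventually consistent reads cost half
    return math.ceil(units_per_read * reads_per_second)


def write_capacity_units(item_bytes, writes_per_second):
    """Provisioned write units for a steady rate of same-sized item writes."""
    return math.ceil(item_bytes / WRITE_UNIT_BYTES) * writes_per_second


# Made-up workload: 6 KB items read 10 times/s, 1.5 KB items written 5 times/s.
print(read_capacity_units(6 * 1024, 10))
print(read_capacity_units(6 * 1024, 10, strongly_consistent=False))
print(write_capacity_units(1536, 5))
```

Item size rounds up per request, so many small items can cost as much throughput as fewer large ones.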