proposals: add "etcd" as "Incubating" CNCF project #143

gyuho · 2018-08-08T19:05:04Z

ref. #136.

/cc @erinboyd @philips @xiang90 @bgrant0607 @jpbetz @d-nishi

gyuho · 2018-08-08T19:06:10Z

/cc @caniszczyk

ghost

+1. I'm fully in support of etcd at incubation.

I added some minor comments that might improve the document. Definitely not blockers though.

ghost · 2018-08-08T19:59:53Z

proposals/etcd.adoc

+
+*Description*:
+
+etcd is a consistent distributed key-value store, designed to hold small amounts of data that can fit entirely in memory (although etcd still writes to disk for durabilities) and mainly used as a separate coordination service for other distributed systems like Kubernetes. Typical etcd cluster is distributed over 3 to 5 nodes, for high availability, while it prioritizes consistency and partition tolerance. Which means, it provides on logical cluster view of many physical servers. So long as majority of cluster is up, etcd continues to work, even under machine failures. This redundancy provides fault tolerance.


nit: durability. Also, wrapping lines around 80-120 chars makes it easier to comment.

nits:
A typical etcd cluster...
a majority.
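
For reference, the "majority of cluster" wording in the description comes down to simple arithmetic. The following is a minimal sketch of that arithmetic only; the function names `quorum` and `failures_tolerated` are hypothetical and are not taken from etcd's code base.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """Members that can be down while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    # Print the majority size and tolerated failures for small cluster sizes.
    for size in (1, 3, 4, 5):
        print(size, quorum(size), failures_tolerated(size))
```

Under this arithmetic a 4-member cluster tolerates no more failures than a 3-member one, which is consistent with the 3 to 5 node sizing mentioned in the description.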

ghost · 2018-08-08T20:05:21Z

proposals/etcd.adoc

+
+etcd is a consistent distributed key-value store, designed to hold small amounts of data that can fit entirely in memory (although etcd still writes to disk for durabilities) and mainly used as a separate coordination service for other distributed systems like Kubernetes. Typical etcd cluster is distributed over 3 to 5 nodes, for high availability, while it prioritizes consistency and partition tolerance. Which means, it provides on logical cluster view of many physical servers. So long as majority of cluster is up, etcd continues to work, even under machine failures. This redundancy provides fault tolerance.
+
+etcd server implements Raft consensus algorithm for data replication. Raft is a leader-based protocol. Data is replicated from leader to follower; follower forwards proposals to leader, and leader decides what to commit or not. Leader persists and replicates an entry, once it has been agreed by the quorum of cluster. The underlying storage layer for Raft log is write-ahead log (WAL). Committed entries are written out to disk, so they can be replayed on restart. etcd uses gRPC for transport layer. Client employs HTTP/2 Ping for server health checking, and implements automatic failover under faulty networks.


Perhaps worth noting explicitly that leader-based replication does not provide horizontal write scalability. Write (and consistent read) throughput is limited to the durable write throughput of a single leader node.
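
To illustrate why every write passes through the leader, here is a rough sketch of how a Raft-style leader could decide which log entries are committed: it tracks the highest log index each member has durably stored and takes the highest index present on a majority. This is an illustration under simplifying assumptions, not etcd's implementation; `commit_index` and `match_index` are hypothetical names, and the additional Raft rule that the entry must belong to the leader's current term is omitted.

```python
from typing import Sequence


def commit_index(match_index: Sequence[int]) -> int:
    """Highest log index stored on a majority of members.

    match_index holds one entry per member (including the leader itself):
    the highest log index that member is known to have persisted.
    """
    if not match_index:
        raise ValueError("match_index must contain at least one member")
    majority = len(match_index) // 2 + 1
    # After sorting in descending order, the value at position majority - 1
    # is stored on at least `majority` members.
    return sorted(match_index, reverse=True)[majority - 1]
```

For example, with three members that have persisted up to indexes 7, 5, and 3, only entries up to index 5 are on a majority. Adding followers raises the majority size rather than splitting the write load, which matches the point about write throughput above.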

ghost · 2018-08-08T20:10:57Z

proposals/etcd.adoc

+
+*External dependencies*: https://github.com/coreos/etcd/blob/master/bill-of-materials.json
+
+*Statement on alignment with CNCF mission*: etcd has enabled adoption of cloud native systems: etcd is "container packaged" by publishing container image for every release, and etcd is a critical component and most reliable storage implementation to many "dynamically managed" systems like Kubernetes.


nit: This paragraph could do with some word-smithing. "has enabled adoption" and "most reliable" are strong claims, and unnecessary, IMO, to justify alignment with the CNCF's mission:

https://www.cncf.io/about/charter/

philips · 2018-08-08T20:39:14Z

Overall looks reasonable. I agree with @quinton-hoole's nits.

gyuho · 2018-08-08T20:57:39Z

Addressed @quinton-hoole's feedback. Thanks!

caniszczyk · 2018-08-08T23:46:22Z

RFC @cncf/toc - will leave this open for a week or so for community feedback

philips · 2018-08-09T17:26:57Z

We also need to add "own, monitor, manage, and maintain discovery.etcd.io" to this list. See etcd-io/etcd#9965 (comment).

gyuho · 2018-08-09T17:46:52Z

@philips That makes sense, since we are donating etcd.io as well.

bgrant0607 · 2018-08-13T20:23:30Z

proposals/etcd.adoc

+
+*Issue tracker*: https://github.com/coreos/etcd/issues
+
+*Initial committers*: https://github.com/philips[Brandon Phillips] and


If the full list of committers/maintainers is in https://github.com/coreos/etcd/blob/master/MAINTAINERS, then I suggest we remove the two specific individuals from here. There are active maintainers from at least Red Hat (CoreOS), Amazon, and Google, correct?

I think there is a misunderstanding of what "Initial Committers" means. Does it mean the people who were initially on the project when it started or the people who will be on the project when donated to CNCF?

For context Xiang and I started the project 5 years ago so I assume that is why we are listed here.

@bgrant0607 @philips I took this from https://github.com/cncf/toc/blob/master/proposals/grpc.adoc, where initial "git" committers are listed as "Initial committers" (ref. https://github.com/grpc/grpc-java/graphs/contributors?from=2014-05-04&to=2014-09-30&type=c).

Sorry, this may not be all that important, but perhaps @caniszczyk has guidance.

@caniszczyk Shall we remove this if it reads as confusing?

@gyuho this means who has commit access to the project when it is donated. Essentially, we want to see who has write access, how diverse the committership is, etc.

@caniszczyk @bgrant0607 Updated to only list maintainers who have write access to the etcd repository. Looking forward to expanding the list!

bgrant0607 · 2018-08-13T20:25:47Z

proposals/etcd.adoc

+
+*External dependencies*: https://github.com/coreos/etcd/blob/master/bill-of-materials.json
+
+*Statement on alignment with CNCF mission*: etcd helps adopt cloud native


Maybe @philips could help improve the wording here

@bgrant0607 how about this?

A consistent and partition-tolerant datastore, like etcd, is a base dependency for many cloud native architectures. Such datastores hold critical cluster configuration and locks while providing guarantees against individual machine failure, network partitions, or data center power loss. In the literature and the ecosystem, etcd or similar systems (e.g. Chubby, ZooKeeper) provide the persistence layer for applications like Kubernetes, CoreDNS, Vitess, Borg, Mesos, and countless others.

LGTM, thanks

@philips @bgrant0607 Just updated this part. Thanks!

caniszczyk · 2018-08-13T21:00:36Z

Another question, are you considering making the https://github.com/coreos/etcd-operator part of this contribution?

philips · 2018-08-13T21:22:17Z

etcd Operator has not been discussed and isn't part of the proposal so far.

bgrant0607 · 2018-08-21T14:39:40Z

@gyuho Is it correct that etcd has just one repo now? It looks like go-etcd, etcdctl, and etcd-ca repos are deprecated?

Is discovery.etcd.io still in use?

gyuho · 2018-08-21T14:52:36Z

@bgrant0607

go-etcd, etcdctl, and etcd-ca repos are deprecated?

Yes. They were deprecated.

All core etcd components (package raft, "etcd" command, "etcdctl" command) are in one https://github.com/coreos/etcd repository.

Is discovery.etcd.io still in use?

Some people still use it. @philips should have more data on this.

jpbetz · 2018-08-21T20:16:48Z

proposals/etcd.adoc

+See https://github.com/coreos/etcd/blob/master/Documentation/production-users.md[etcd production users] for more.
+
+Integrations: Kubernetes API server persists cluster metadata in etcd.
+OpenStack uses etcd to keep track service liveness. CockroachDB, TiKV, and


"keep track of"

Fixed, thanks @jpbetz !

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

philips · 2018-08-22T18:39:43Z

@bgrant0607 discovery.etcd.io is still in extremely wide use and is used by default in OpenStack spin-up scripts. We had an outage recently on that service and dozens of people noticed on Twitter, GitHub, etc.

jonboulle · 2018-09-24T12:55:49Z

@caniszczyk where did this end up?

caniszczyk · 2018-09-24T13:17:25Z

@jonboulle voting is open: https://lists.cncf.io/g/cncf-toc/message/2237

The project team is still finalizing when they would like to make the announcement. Our approach has been to work with the project communities on their preferred time of announcement (there are enough votes for this to technically pass).

caniszczyk · 2018-12-11T16:12:55Z

+1 binding TOC votes (8/9):

Quinton: https://lists.cncf.io/g/cncf-toc/message/2259
Camille: https://lists.cncf.io/g/cncf-toc/message/2262
Brian: https://lists.cncf.io/g/cncf-toc/message/2268
Ben: https://lists.cncf.io/g/cncf-toc/message/2270
Sam: https://lists.cncf.io/g/cncf-toc/message/2271
Alexis: https://lists.cncf.io/g/cncf-toc/message/2278
Ken: https://lists.cncf.io/g/cncf-toc/message/2308
Jon: https://lists.cncf.io/g/cncf-toc/message/2431

Welcome etcd to the CNCF community :)

caniszczyk changed the title ~~proposals: add "etcd" as "Incubating" project~~ proposals: add "etcd" as "Incubating" CNCF project Aug 8, 2018

ghost reviewed Aug 8, 2018

bgrant0607 reviewed Aug 13, 2018

jpbetz reviewed Aug 21, 2018

proposals: add "etcd" as "Incubating" project

0f53ac5

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

hexfusion mentioned this pull request Aug 23, 2018

build-docker: Added support for s390x etcd-io/etcd#10029

Closed

caniszczyk added this to In progress (due diligence) in TOC Project Backlog Oct 10, 2018

caniszczyk self-assigned this Oct 16, 2018

caniszczyk added the incubation label Oct 16, 2018

caniszczyk moved this from In progress (due diligence) to TOC Approved (sponsors/voting) in TOC Project Backlog Nov 15, 2018

caniszczyk merged commit b187d38 into cncf:master Dec 11, 2018

gyuho mentioned this pull request Dec 11, 2018

blog: add etcd post for December 11, 2018 kubernetes/website#11524

Merged

caniszczyk moved this from TOC Approved (sponsors/voting) to Done in TOC Project Backlog Dec 12, 2018
