Releases: futurewei-cloud/alcor
v1.1.0-beta
Release Summary
This release focuses on enabling compatibility with the Arion gateway platform, enhancing the messaging mechanism, and improving Alcor fundamentals.
New Arion-related Features Development
- Add Alcor VPC gateway goal state to support a default routing rule from ACA (compute node) to the Arion gateway (PR #748 part 1)
- Support consistent hashing (based on VNI and destination subnet) across Arion gateway groups for traffic partitioning and balancing (PR #748 part 2)
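To illustrate the consistent-hashing item above, here is a minimal sketch of a hash ring that maps a (VNI, destination subnet) pair to a gateway group. It is not taken from PR #748: the class name `GatewayRing`, the group names, the virtual-node count, and the choice of SHA-256 are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class GatewayRing:
    """Hypothetical consistent-hash ring: (VNI, destination subnet) -> gateway group."""

    def __init__(self, groups, vnodes=64):
        # Place each group on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{group}#{i}"), group) for group in groups for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, vni: int, dest_subnet: str) -> str:
        # Walk clockwise to the first ring point at or after the key's hash.
        idx = bisect.bisect_left(self._points, _hash(f"{vni}|{dest_subnet}"))
        return self._ring[idx % len(self._ring)][1]


# Example usage with made-up group names.
ring = GatewayRing(["arion-group-a", "arion-group-b", "arion-group-c"])
print(ring.lookup(21, "10.0.1.0/24"))
```

With this layout, removing a gateway group only remaps the keys that were assigned to that group. Traffic for the other groups stays where it was, which is the usual reason to prefer consistent hashing over a plain modulo for partitioning.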
MQ-related Feature Development
- Add gRPC/MQ auto switch mechanism (PR #742)
- Remove hostIps hashset from MulticastGoalstateV2 (PR #741)
Fundamental Improvement
v1.0-beta
Release Summary
This release focuses solely on performance and scalability enhancements to the agent, the controller and the end-to-end workflow.
Some highlights of the v1.0 release:
- Redesign ACA orchestration layer with high-performance threading model, and achieve 75% latency reduction for programming one million OVS flows.
- Improve port/subnet/vpc API throughput by 5x compared to the last release, and achieve up to 30x port throughput gain compared to Neutron at the performance tipping point.
- Measure single-host on-demand throughput of up to 300K requests/second with the enhanced Alcor benchmarking framework based on CBench.
- Integrate the new Goal State V2 message end to end with the SDN southbound messaging workflow.
- Redesign cache/db schema in multiple microservices to remove choke points and support higher concurrency.
Alcor components
- Regional control plane: https://github.com/futurewei-cloud/alcor
- Host control agent: https://github.com/futurewei-cloud/alcor-control-agent
- Python client: https://github.com/futurewei-cloud/python-alcorclient
- Performance and scalability test
- Perf summary: https://github.com/futurewei-cloud/alcor-perf
- Rally framework (SDN northbound top-down test): https://github.com/futurewei-cloud/rally-openstack-alcor
- CBench framework (SDN southbound bottom-up test): https://github.com/futurewei-cloud/alcor_oflops
- E2E integration: https://github.com/futurewei-cloud/alcor-int
- Meeting note: https://github.com/futurewei-cloud/alcor-meeting
Features Added
New Features Development
- Alcor Control Agent orchestration layer v2.0 Design & Development
- New orchestration layer to enhance scheduling of programming task to data plane (Agent PR #275)
- ACA threading model redesign to support high concurrency (Perf report)
- Evaluation of high performance task framework and integrate CBench with ACA (Oflops PR #1)
- Add finish call for gRPC server (Agent PR #272)
- Alcor Goal State V2 E2E integration
- Data Plane Manager supports GSv2 for unicast and multicast (Controller PRs #625, #699)
- Network Configuration Manager supports new GSv2 (Controller PR #704)
- Alcor Control Agent supports routing rule update with GSv2 (Agent PR #267)
- Make GS version configurable and support backward compatibility to GSv1 (Controller PR #718)
Alcor Performance & Scalability
- Performance/Scalability report and plan
- NCM stress test report (Perf PR #13)
- ACA threading model and task scheduler engine (Perf report)
- Microservice Performance Design & Improvement
- Improve caching performance for port/ip/vpc managers and common libraries (Controller PR #690)
- Refactor db/cache code in DPM to support GSv2 (Controller PR #713)
- Support SQL field query in Alcor (Controller PR #703)
- Ignite watch feature test framework (Perf PR #11)
- Reduce distributed locking and improve concurrency for IP Manager ipAddrRangeCache (Controller PR #702)
- DPM reduces message redundancy to NCM (Controller PR #734)
- Test control enhancement with new configuration options (Controller PR #694)
- Support coexistence of gRPC and REST APIs (Controller PR #706)
- Message Queue scale path improvement
- Enable Pulsar client to support VPC-topic mode based on GSv1 and GSv2 (Controller PR #695)
- Add rollback mechanism for failure of sending TopicInfo (Controller PR #726)
- Map vpcId to hostIp in MulticastGoalStateV2 (Controller PR #728)
- Add listeners for multicast and unicast consumers (Agent PR #268)
Alcor Fundamental
Alcor v0.19-alpha
Release Summary
This release focuses solely on performance and scalability improvements to the agent, the controller and agent-controller communication.
Some highlights of the v0.19-alpha release:
- Deliver ACA ovs-driver v2.0 with 100-1000x latency and scalability improvement to support one-million ports per VPC.
- Reduce latency of new on-demand workflows by 70% with various optimizations on messaging and multi-threading modes.
- Improve port API throughput by 60% and achieve up to 16x throughput gain compared to Neutron at the performance tipping point.
- Deliver routing policy feature and complete e2e integration.
- Onboard Alcor distributed tracing framework enabled by Jaeger/OpenTracing for fine-grained performance profiling.
- Introduce daily Jenkins jobs to automate build, deployment and e2e testing (for Alcor and OVS data plane).
- Stabilize code and fix 10+ functionality bugs & 20+ performance bugs.
Features Added
New Features Development
- Alcor Control Agent ovs-driver v2.0 Design & Development
- New driver communication layer to enhance OpenFlow connection/flow control performance (Agent PR #261)
- Host ip lookup optimization (Agent PR #258)
- Batching and parallel resource state processing for one million OVS flows (Agent PR #264)
- Alcor routing policy and feature E2E integration
- Route Manager supports routing rule update (Controller PR #624)
- Data Plane Manager supports routing rule update (Controller PR #664)
- Alcor Control Agent supports OVS programming of routing rules (Agent PRs #244, #247, #254, #257)
- API GW onboards Nova-compatible route APIs (Controller PRs #676, #677)
- Alcor Python client and Rally plugin to support new API perf test (Client PR #1, Alcor Rally PR #1)
- Alcor distributed tracing framework
Alcor Performance & Scalability
- Performance/Scalability report and plan
- gRPC performance analysis and channel optimization for Alcor on-demand workflow (Perf PR #7)
- ACA v2.0 ovs driver refactoring perf report (Perf PR #8)
- ACA goal state processing report for massive L2/L3 neighbor (Perf PR #9)
- Alcor performance test plan (Controller PR #643)
- Controller Performance Design & Improvement
- Design on Alcor database/cache transactions (Controller PR #611)
- Optimize VPC creation of default segment table from 10s to 1s (Controller PR #642)
- Remove RM transactions for a single cache operation (Controller PR #658)
- Optimize cache write latency in NCM to sub-second and reduce up to 90% latency (Controller PR #659)
- Transaction atomicity mode change for microservices (Controller PR #673)
- Fix race condition for Subnet and Port APIs (Controller PR #685)
- Messaging performance improvement
- gRPC channel/stub pool and channel keepalive (Controller PR #637)
- gRPC channel warm-up (Controller PR #670)
- Async gRPC client, cleanup thread and intelligent sleep (Agent PR #241)
- Async gRPC server implementation and thread pool for goal state push (Agent PR #255)
- Adjustable thread pool size based on host core number (Agent PR #256)
- [Backward compatibility] Async gRPC server GSv1 support (Agent PR #260)
- Message Queue scale path improvement
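As an illustration of the "adjustable thread pool size based on host core number" item in the messaging list above, the following is a minimal sketch of deriving a worker count from the core count and using it to push to several hosts in parallel. ACA itself is a separate agent codebase and this is not its code: the function names, the 2x multiplier, the clamp bounds, and the `send` callable are all hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pool_size(cores=None, per_core=2, minimum=4, maximum=64):
    """Derive a worker count from the host core count, clamped to a fixed range."""
    if cores is None:
        cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(minimum, min(maximum, cores * per_core))


def push_goal_states(hosts, send, cores=None):
    """Call the hypothetical `send(host)` for every host using a core-sized pool."""
    with ThreadPoolExecutor(max_workers=pool_size(cores)) as pool:
        # map() preserves input order in its results.
        return list(pool.map(send, hosts))
```

Clamping the size keeps very small hosts from serializing all pushes and very large hosts from creating more threads than the workload can use. The right multiplier depends on how I/O-bound the pushes are.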
Alcor Fundamental
- Alcor DevOps and CI/CD enhancement
- Alcor K8s deployment improvement (Controller PRs #622, #672, #675)
Stabilization and Bug fix
Alcor v0.14-alpha
Release Summary
This release focuses on enabling new workflows, fundamental improvements and code stabilization. Some highlights of the v0.14-alpha release:
- Deliver 2 new microservices & refactor 1 existing microservice for performance gain
- Proof of concept for the new on-demand workflow with Network Configuration Manager and ACA on-demand engine
- Introduce SDN gateway manager and finish E2E integration with Cloud Zeta gateway
- Onboard new Goal State V2 for messaging performance improvement with various refined protobuf message changes
- Streamline build, test and deployment pipeline and support integrated testing for both control and data plane
- Code stabilization with 20+ bug fixes
Features Added
New Features Development
- New MicroService design & development
- Alcor Control Agent Design & Development
- On-demand engine (Agent PRs #226, #241)
- On-demand ARP responder (Agent PR #167)
- Refactor packet parser (Agent PR #225)
- Feature E2E Integration
- On-demand workflow with test controller (Controller PR #560)
- Alcor integration with Zeta Gateway (Controller PRs #543, #545, #547, and Agent PRs #158, #172, #179, #184, #191, #192, #193, #196, #198, #199, #202, #204, #206, #207, #208)
- Node and NCM registration APIs (Controller PRs #563, #572, #574, #581, #582, #586, #596, #598)
Alcor Performance & Scalability
- Scalability Design & Improvement
- Private IP Manager improvement with IP address replacement support (Controller PR #520)
- Agent performance improvement with massive L2 neighbor handling (Agent PR #176)
- Leverage ovs_control function (Agent PR #186)
- Messaging performance improvement
- Performance report on Apache Ignite 2.9.1 vs Postgres 13.1 for Alcor (Perf PR #3)
- Message Queue scale path design improvement (Controller PRs #532, #544, #550, #594)
Alcor Fundamental
- Transaction atomicity mode change (Controller PR #604)
- Alcor DevOps and CI/CD enhancement
- K8s deployment process improvement (Controller PRs #528, #529, #538, #581)
- Travis test ova fix (Agent PR #215)
- MIT license update (Controller PRs #591, #592, #595, #597, and Agent PRs #234, #235, #236)
Stabilization and Bug fix
Alcor v0.10-alpha release
Release Summary
This release focuses on enabling new workflows, performance profiling and scalability testing. Some highlights of the v0.10-alpha release:
- Deliver 2 new microservices & refactor 2 existing microservices for performance gain
- Complete E2E integration for L3 routing and DHCP programming with OpenStack Nova, Keystone, Horizon UI and CLI
- Enable MQ scaling path for large-scale customer scenario
- Compare performance with Neutron and demonstrate at least a 10x gain on latency and a 20x gain on throughput and concurrency
- Set up scalability test framework for 1 million simulated compute nodes and collect preliminary scalability test results for both gRPC fast path and MQ scaling path
Features Added
MicroService Design & Development
- NACL Manager (PR #251)
- Quota Manager (PR #359, #387, #391)
- Route Manager v2.0 (PR #361, #380, #382, #385, #397, #495)
- Data-Plane Manager v2.0 (PR #389, #423, #427, #472, #486)
- Enable MQ scaling path (PR #481, ACA PR #133)
Release of New APIs
- NACL CRUD (API spec)
- Router CRUD (API spec)
- Routing table and rules CRUD (API spec)
- Route & router interface binding/unbinding (API spec)
- Quota CRUD (PR #359)
- New Admin API to pre-create VXLAN/GRE ranges (PR #373)
E2E Integration of Key Workflows
- DHCP programming (ACA PR #164)
- Port L3 routing (PR #366, #382, #393, #396, #418, #445, #446, & ACA PR #128)
- New subnet-scope programming path (PR #429, #468)
Alcor Control Agent Design & Development
- Security group Host Design (PR #390)
- Implementation of a distributed on-host DHCP & DNS component (ACA PR #102, #136, #164)
- gRPC streaming server implementation (ACA PR #124)
- Support add-flow, del-flows, mod-flows, and dump-flows functions (ACA PR #137)
- Initial design of Elastic IP and SNAT on the host
- Port & Neighbor deletion support on host (ACA PR #166)
- Implementation of Pulsar ACA client (ACA PR #133)
Alcor Performance & Scalability
- Performance reports on E2E latency, throughput and concurrency
- Scalability test framework of 1M simulated nodes
- gRPC fast path scalability report
- MQ scaling path scalability report
- MQ system performance comparison report
- Port Manager performance profiling and report
- Mac Manager performance profiling and report
Integration with SDN Gateways
- High-level design to support SDN Gateways (Design doc)
- Alcor integration with Zeta Gateway (Note: work delivered in collaboration with Zeta)
- High-level design on Alcor & Zeta integration (ACA PR #151, #157)
- Basic gateway communication framework in ACA (ACA PR #158, #172)
Fundamental
- DPM UT test automation (PR #394, #398, #458)
- Increase thread pool size to improve concurrency (PR #401)
- Upgrade Ignite to latest 2.9.0 (PR #503)
- Enhance DB query with project Id validation (PR #502)
- Sort out Controller-Agent Contract (PR #362, #381, #467, #479, #484)
- Consolidate host related entity classes (PR #437, #421)
- Deployment scripts and K8s yaml for controller (PR #365)
- ACA unit test breakout (ACA PR #153)
- ACA deployment script enhancement (ACA PR #140, #143, #146)
Stabilization and Bug fix
v0.8-alpha
Release Summary
This release focuses on microservice implementation for Alcor control plane, E2E integration with OpenStack, and performance tuning and comparison with Neutron. Some highlights of this release:
- Deliver 8 new microservices along with customer-facing APIs
- Enable gRPC fast path to achieve sub-second port provisioning latency for latency-sensitive applications
- Complete E2E integration with OpenStack Nova, Keystone, Horizon UI and CLI
- Compare performance with Neutron and demonstrate at least 10x performance gain on latency, throughput and concurrency
- Migration plan from Neutron to Alcor is in place
Features Added
New MicroService Design & Development
- Subnet Manager (PR #154, #174, #334)
- Port Manager (PR #180, #197, #208, #248, #271)
- Private IP Manager (PR #164, #272)
- Virtual Mac Manager (PR #147, #183, #188, #206)
- Security Group Manager (PR #208)
- Elastic IP Manager (PR #215, #243, #298, #347)
- Data Plane Manager (PR #234, #240, #254, #355)
- Node Metadata Manager (PR #189, #249, #326, #340)
Release of New APIs
- Subnet CRUD (API spec)
- Port CRUD and bulk operation (API spec)
- Security group and security group rule CRUD (API spec)
- Port & security group binding/unbinding (API spec)
- Segment & segment range CRUD (PR #181, API spec)
- Network IP availability and usage stats (PR #353, API spec)
- Elastic IP management (PR #243, API spec)
- Port & EIP association/disassociation (PR #298, API spec)
Alcor Common Libraries
- Create common modules including AlcorLib, AlcorWeb, and AlcorSchema (PR #195, #220)
- Improve Ignite database/cache usage and support SQL-like query (PR #208)
- Add common async executor to AlcorLib (PR #180)
- Add common microservice rest clients to AlcorWeb (PR #180)
- Support service rollback when create/update operations fail (PR #180)
- Support multi-param query (PR #252, #326, #356)
- Cache layer supports setting of expire time (PR #219, #342)
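To make the "cache layer supports setting of expire time" item concrete, below is a minimal sketch of a cache whose entries expire a fixed time after they are written. It is a plain in-memory illustration, not the Ignite-backed cache layer from PR #219 or #342: the class name `ExpiringCache`, its methods, and the lazy-eviction policy are assumptions.

```python
import time


class ExpiringCache:
    """Hypothetical in-memory cache; entries expire `ttl_seconds` after being written."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be exercised without sleeping
        self._entries = {}    # key -> (value, expires_at)

    def put(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # evict lazily, on the first read after expiry
            return default
        return value
```

A production cache would normally also evict in the background so that keys that are never read again do not accumulate.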
Alcor Control Agent Design & Development
- High-level design of OVS and ACA integration (ACA PR #94)
- L2 port programming with OVS (ACA PR #105)
- Local VLAN manager implementation (ACA PR #109)
- Tracking and processing of OVS packet-in and packet-out (ACA PR #112)
- Design of a two-layer distributed DHCP component (ACA PR #103)
- Design and implementation of L3 distributed routing for inter-subnet VM to VM communication (PR #339, ACA PR #119)
Performance tuning and E2E performance test
- Optimize mac allocation algorithm (#292, #317)
- Port Manager 2.0 (PR #301)
- VPC code refactor for latency improvement (PR #357)
- Alcor performance annotation (PR #276)
- Add time cost estimation for all microservices (PR #350)
- Release preliminary performance comparison results
Compatibility and Integration with OpenStack
- Integration with Nova (PR #202, #204)
- Integration with Horizon UI (PR #322)
- Compatible with OpenStack CLI
- Migration plan for existing Neutron clients (PR #205)
Security
Fundamental
- Add Code coverage report for Alcor controller repository (PR #241, #247)
- UT coverage improvement for existing microservices (PR #174)
- Support Swagger API document and Swagger UI (PR #184, #187, #313)
- Enable Ignite for all microservices (#337)
- Add Ignite service mock in UTs (PR #201)
- Standardize service pom and remove unused dependencies (PR #198)
- Upgrade Alcor Project to Use OpenJDK 11 LTS (#266)
- Design Documentation Improvement with Antora (PR #210)
- Logging enhancement (PR #327, #329)
Deployment, Monitoring and CI/CD
- Deploy controller with Kubernetes (PR #222, #290, #291, #325)
- Enable CI/CD workflow for Alcor controller repository (PR #173, #186)
- Alcor Onebot setup for development and deployment (#306)
- Alcor Monitoring with NetData
Stabilization and Bug fix
Alcor v0.3-alpha release
Release summary
This release focuses on the microservice design and implementation for Alcor control plane. It introduces three microservices: VPC Manager, Route Manager and the API gateway service. It also implements customer-facing VPC operation APIs integrated with the new microservice framework.
Features Added
Alcor Controller
- API gateway (PR #143)
- VPC Manager (PR #134)
- Route Manager (PR #144)
- VPC operation workflow design & implementation (PR #134, #143, #144)
- Database module with Apache Ignite in-memory cache support (PR #129)
- Build and deployment enhancement (PR #118, #121)
- Controller logging improvement (PR #107, #116, #120)
Alcor Control Agent
Document
- Overall microservice architecture design & API workflow for vpc/subnet/port
- Network config data model (PR #106)
- Control plane API document v0.1 (PR #123)
Alcor v0.1-alpha release
Features:
a. MVP VPC features to manage Mizar data plane
b. Controller Batch API for throughput optimization
c. Novel goal state data model for controller-agent communication
d. Preliminary implementation of Fast Path
e. Onebox development setup for control plane E2E
f. Integration with Kubernetes for container network provisioning