Canary Rollout
This section describes how to perform a canary rollout of a container service.
Before starting
- Enable the kruise-rollout addon. The canary rollout capability relies on Rollouts from OpenKruise.
  vela addon enable kruise-rollout
- Make sure an ingress controller is available in your cluster. If you don't have one, you can enable the ingress-nginx addon:
  vela addon enable ingress-nginx
  Refer to the addon doc to get the access address of the gateway.
- Some commands, such as rollback, rely on vela-cli >=1.5.0-alpha.1. Upgrade the command line tool if needed. You don't need to upgrade the controller.
First Time Deploy
If you want to use canary rollout for every upgrade, the component must ALWAYS have a kruise-rollout trait.
A day-2 canary rollout of the component requires this trait to be attached already. Deploy the application with traits like below:
cat <<EOF | vela up -f -
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: canary-demo
  annotations:
    app.oam.dev/publishVersion: v1
spec:
  components:
  - name: canary-demo
    type: webservice
    properties:
      image: barnett/canarydemo:v1
      ports:
      - port: 8090
    traits:
    - type: scaler
      properties:
        replicas: 5
    - type: gateway
      properties:
        domain: canary-demo.com
        http:
          "/version": 8090
    - type: kruise-rollout
      properties:
        canary:
          steps:
          # First batch: 20% of the Pods are released as canary and 20% of the traffic is routed to the new version.
          # Manual confirmation is required before the next batch is released.
          - weight: 20
          # Second batch: 90% of the Pods are released as canary and 90% of the traffic is routed to the new version.
          - weight: 90
          trafficRoutings:
            - type: ingress
EOF
Here's an overview of what happens during an upgrade under this kruise-rollout trait configuration. The whole process is divided into 3 steps:
- When the upgrade starts, a new canary deployment is created with 20% of the total replicas. In our example with 5 total replicas, it keeps all the old ones, creates 5 * 20% = 1 replica for the new canary, and routes 20% of the traffic to it. It then waits for manual approval once everything is ready.
  - By default, the percentage of replicas is aligned with the traffic weight. You can also configure the replicas individually according to this doc.
- After the manual approval, the second batch starts. It creates 5 * 90% = 4.5, which is actually 5, replicas of the new version, and these receive 90% of the traffic. As a result, the system now has 10 replicas in total. It then waits for a second manual approval.
- After the second approval, it updates the workload, that is, it uses the rolling update mechanism of the workload itself to upgrade. Once the workload finishes upgrading, all traffic is routed to that workload and the canary deployment is destroyed.
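The replica counts in the steps above can be reproduced with simple integer arithmetic. The following is a minimal sketch, assuming the canary replica count is the weight percentage of the total replicas rounded up (this matches the 4.5 to 5 case in the example). `canary_replicas` is a hypothetical helper written for illustration; it is not part of vela or kruise-rollout.

```shell
# Hypothetical helper: number of canary replicas for a given step.
# $1 = total replicas, $2 = step weight in percent.
# Assumption: fractional results are rounded up, as in the 4.5 -> 5 example.
canary_replicas() {
  # Integer ceiling of ($1 * $2) / 100.
  echo $(( ($1 * $2 + 99) / 100 ))
}

canary_replicas 5 20   # prints 1
canary_replicas 5 90   # prints 5
```

With 5 old replicas kept alongside the canary, the second step gives 5 + 5 = 10 replicas in total, as described above.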
Let's continue our demo. The first deployment is no different from a normal deployment. Check the status of the application to make sure it's running before the next step.
$ vela status canary-demo
About:
  Name:         canary-demo                  
  Namespace:    default                      
  Created at:   2022-06-09 16:43:10 +0800 CST
  Status:       running                      
...snip...
Services:
  - Name: canary-demo  
    Cluster: local  Namespace: default
    Type: webservice
    Healthy Ready:5/5
    Traits:
      ✅ scaler      ✅ gateway: No loadBalancer found, visiting by using 'vela port-forward canary-demo'
      ✅ kruise-rollout: rollout is healthy
If you have enabled the velaux addon, you can view the application topology graph, which shows that all v1 pods are ready now.

Access the gateway endpoint with the specific host by:
$ curl -H "Host: canary-demo.com" <ingress-controller-address>/version
Demo: V1
The host canary-demo.com matches the domain configured in the gateway trait of your application. You can also add it to your /etc/hosts to visit the service by its host URL.
Day-2 Canary Release
Let's modify the image tag of the component, from v1 to v2 as follows:
cat <<EOF | vela up -f -
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: canary-demo
  annotations:
    app.oam.dev/publishVersion: v2
spec:
  components:
  - name: canary-demo
    type: webservice
    properties:
      image: barnett/canarydemo:v2
      ports:
      - port: 8090
    traits:
    - type: scaler
      properties:
        replicas: 5
    - type: gateway
      properties:
        domain: canary-demo.com
        http:
          "/version": 8090
    - type: kruise-rollout
      properties:
        canary:
          # First batch: 20% of the Pods are released as canary and 20% of the traffic is routed to the new version. Manual confirmation is required before the next batch is released.
          steps:
          - weight: 20
          - weight: 90
          trafficRoutings:
          - type: ingress
EOF
This creates a canary deployment and waits for manual approval. Check the status of the application:
$ vela status canary-demo
About:
  Name:         canary-demo                  
  Namespace:    default                      
  Created at:   2022-06-09 16:43:10 +0800 CST
  Status:       runningWorkflow              
...snip...
Services:
  - Name: canary-demo  
    Cluster: local  Namespace: default
    Type: webservice
    Unhealthy Ready:5/5
    Traits:
      ✅ scaler      ✅ gateway: No loadBalancer found, visiting by using 'vela port-forward canary-demo'
      ❌ kruise-rollout: Rollout is in step(1/1), and you need manually confirm to enter the next step
The application's status is runningWorkflow, which means the application's rollout process has not finished yet.
View the topology graph again and you will see that the kruise-rollout trait created a v2 pod, which serves the canary traffic. Meanwhile, the v1 pods are still running and serve the non-canary traffic.

Access the gateway endpoint again. About 20% of the requests should return Demo: V2.
$ curl -H "Host: canary-demo.com" <ingress-controller-address>/version
Demo: V2
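A single request only shows one side of the split, so it can help to sample the endpoint many times and count the responses. The sketch below is one way to do that under stated assumptions: `v2_share` is a hypothetical helper that reads one response per line and prints the percentage containing V2, and `INGRESS_ADDR` in the commented usage is an assumed variable that you would set to your ingress controller address.

```shell
# Hypothetical helper: reads responses on stdin, one per line, and prints
# the integer percentage of non-empty lines that contain "V2".
v2_share() {
  awk 'NF { n++ } /V2/ { v2++ } END { if (n > 0) printf "%d\n", v2 * 100 / n; else print 0 }'
}

# Usage sketch (not executed here; INGRESS_ADDR is an assumed variable):
# for i in $(seq 1 50); do
#   curl -s -H "Host: canary-demo.com" "$INGRESS_ADDR/version"; echo
# done | v2_share
```

The result is only an estimate of the configured weight, so expect it to vary around 20 for the first batch rather than match it exactly.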
Continue Canary Process
After verifying the canary version through logs, business-related metrics, and other means, you can resume the workflow to continue the rollout.
vela workflow resume canary-demo
Access the gateway endpoint several more times. The chance of getting Demo: V2 is now much higher (about 90%).
$ curl -H "Host: canary-demo.com" <ingress-controller-address>/version
Demo: V2
Canary validation succeeded, finish the release
In the end, you can resume again to finish the rollout process.
vela workflow resume canary-demo
Access the gateway endpoint several more times. The result is now always Demo: V2.
$ curl -H "Host: canary-demo.com" <ingress-controller-address>/version
Demo: V2
Canary verification failed, roll back the release
If your manual checks fail and you want to cancel the rollout and return the application to the latest succeeded version, you can roll back the rollout workflow.
Suspend the workflow before rolling back:
$ vela workflow suspend canary-demo
Rollout default/canary-demo in cluster  suspended.
Successfully suspend workflow: canary-demo
Then roll back:
$ vela workflow rollback canary-demo
Application spec rollback successfully.
Application status rollback successfully.
Rollout default/canary-demo in cluster  rollback.
Successfully rollback rollout
Application outdated revision cleaned up.
Access the gateway endpoint again. You can see the result is always Demo: V1.
$ curl -H "Host: canary-demo.com" <ingress-controller-address>/version
Demo: V1
Any rollback operation in the middle of a runningWorkflow rolls back to the latest succeeded revision of the application. For example, if you deploy v1 successfully and upgrade to v2, but v2 does not succeed and you continue upgrading to v3, a rollback of v3 automatically returns to v1, because v2 is not a succeeded release.
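The rule for choosing the rollback target can be illustrated with a small sketch. This is not how vela implements rollback internally; `rollback_target` is a hypothetical helper, and the status words `succeeded`, `failed`, and `running` are labels assumed for this example. It reads revisions in release order and prints the most recent one marked as succeeded.

```shell
# Hypothetical helper: reads "revision status" pairs on stdin in release
# order and prints the latest revision whose status is "succeeded".
rollback_target() {
  awk '$2 == "succeeded" { target = $1 } END { print target }'
}

# v2 did not succeed and v3 is still rolling out, so the target is v1.
printf 'v1 succeeded\nv2 failed\nv3 running\n' | rollback_target   # prints v1
```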