Tagged: Kubernetes

  • Wang 22:12 on 2019-02-11
    Tags: Kubernetes

    Guarantee service availability in kubernetes 

    A good service not only provides good functionality, but also ensures availability and uptime.

    We reinforce our services in terms of QoS, QPS, throttling, scaling, throughput, and monitoring.

    QoS

    There are three QoS classes in Kubernetes: Guaranteed, Burstable, and BestEffort. We usually use Guaranteed or Burstable, depending on the service.

    #Guaranteed
    resources:
      requests:
        cpu: 1000m
        memory: 4Gi
      limits:
        cpu: 1000m
        memory: 4Gi
    
    #Burstable
    resources:
      requests:
        cpu: 1000m
        memory: 4Gi
      limits:
        cpu: 6000m
        memory: 8Gi
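
    For completeness, the third class, BestEffort, is what a pod gets when it specifies no requests or limits at all; we avoid it for services, since BestEffort pods are the first to be evicted under memory pressure.

    #BestEffort
    resources: {}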
    
    QPS

    We run a lot of stress tests on our APIs with Gatling before releasing them. We mainly care about mean response time, standard deviation, mean requests/sec, and error rate (API Testing Report); during testing we monitor server metrics with Datadog to find bottlenecks.

    We usually test APIs in two scenarios: internal and external. External test results are much worse than internal ones because of network latency, bandwidth, and so on.

    Internal testing result

    ================================================================================
    ---- Global Information --------------------------------------------------------
    > request count                                     246000 (OK=246000 KO=0     )
    > min response time                                     16 (OK=16     KO=-     )
    > max response time                                   5891 (OK=5891   KO=-     )
    > mean response time                                    86 (OK=86     KO=-     )
    > std deviation                                        345 (OK=345    KO=-     )
    > response time 50th percentile                         30 (OK=30     KO=-     )
    > response time 75th percentile                         40 (OK=40     KO=-     )
    > response time 95th percentile                         88 (OK=88     KO=-     )
    > response time 99th percentile                       1940 (OK=1940   KO=-     )
    > mean requests/sec                                817.276 (OK=817.276 KO=-     )
    ---- Response Time Distribution ------------------------------------------------
    > t < 800 ms                                        240565 ( 98%)
    > 800 ms < t < 1200 ms                                1110 (  0%)
    > t > 1200 ms                                         4325 (  2%)
    > failed                                                 0 (  0%)
    ================================================================================
    

    External testing result

    ================================================================================
    ---- Global Information --------------------------------------------------------
    > request count                                      33000 (OK=32999  KO=1     )
    > min response time                                    477 (OK=477    KO=60001 )
    > max response time                                  60001 (OK=41751  KO=60001 )
    > mean response time                                   600 (OK=599    KO=60001 )
    > std deviation                                        584 (OK=484    KO=0     )
    > response time 50th percentile                        497 (OK=497    KO=60001 )
    > response time 75th percentile                        506 (OK=506    KO=60001 )
    > response time 95th percentile                       1366 (OK=1366   KO=60001 )
    > response time 99th percentile                       2125 (OK=2122   KO=60001 )
    > mean requests/sec                                109.635 (OK=109.631 KO=0.003 )
    ---- Response Time Distribution ------------------------------------------------
    > t < 800 ms                                         29826 ( 90%)
    > 800 ms < t < 1200 ms                                1166 (  4%)
    > t > 1200 ms                                         2007 (  6%)
    > failed                                                 1 (  0%)
    ---- Errors --------------------------------------------------------------------
    > i.g.h.c.i.RequestTimeoutException: Request timeout after 60000      1 (100.0%)
     ms
    ================================================================================
    
    Throttling

    We throttle the APIs with Nginx limits; we configured the ingress like this:

    annotations:
      nginx.ingress.kubernetes.io/limit-connections: '30'
      nginx.ingress.kubernetes.io/limit-rps: '60'
    

    This dynamically generates Nginx configuration like the following:

    limit_conn_zone $limit_ZGVsaXZlcnktY2RuYV9kc2QtYXBpLWNkbmEtZ2F0ZXdheQ zone=xxx_conn:5m;
    limit_req_zone $limit_ZGVsaXZlcnktY2RuYV9kc2QtYXBpLWNkbmEtZ2F0ZXdheQ zone=xxx_rps:5m rate=60r/s;
    
    server {
        server_name xxx.xxx;
        listen 80;
        
        location ~* "^/xxx/?(?<baseuri>.*)" {
            ...
            limit_conn xxx_conn 30;
            limit_req zone=xxx_rps burst=300 nodelay;
            ...
        }
    }
    
    Scaling

    We use HPA in Kubernetes to ensure automatic scaling (Auto scaling in kubernetes); you can check the HPA status on the server:

    [xxx@xxx ~]$ kubectl get hpa -n test-ns
    NAME       REFERENCE             TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
    api-demo   Deployment/api-demo   39%/30%, 0%/30%   3         10        3          126d
    
    [xxx@xxx ~]$ kubectl get pod -n test-ns
    NAME                           READY     STATUS    RESTARTS   AGE
    api-demo-76b9954f57-6hvzx      1/1       Running   0          126d
    api-demo-76b9954f57-mllsx      1/1       Running   0          126d
    api-demo-76b9954f57-s22k8      1/1       Running   0          126d
    
    
    Throughput & Monitoring

    We integrated Datadog for monitoring (Monitoring by Datadog); we can check detailed API metrics on various dashboards.

    We can also calculate throughput from the number of users, the number of requests, and the request time.
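
    A rough sketch of that calculation, using Little's law with made-up numbers rather than figures from our tests:

    throughput ≈ concurrent users / mean request time
               ≈ 100 users / 0.1 s per request
               = 1000 requests/sec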

     
  • Wang 21:26 on 2019-01-14
    Tags: Kubernetes

    Monitoring by Datadog 

    We have thousands of containers running on hundreds of servers, so we need a comprehensive monitoring system to track service and server metrics.

    We investigated the popular cloud monitoring platforms New Relic and Datadog, and finally decided to use Datadog.

    Dashboard: Datadog can detect services and configure dashboards for you automatically.

    Container & Process: you can clearly see all your containers and processes across every environment.

    Monitors: Datadog automatically creates monitors according to the service type; if they don’t meet your requirements, you can create your own. It’s also convenient to send alert messages through Slack or email.

    APM: Datadog provides various charts for API analysis, and there’s also a Service Map where you can check service dependencies.

    Synthetics: a newer Datadog feature that tests your API from locations around the world to check availability and uptime.

     
  • Wang 21:44 on 2018-11-20
    Tags: Kubernetes

    Sticky session in Kubernetes 

    As we know, a RESTful API is stateless; every request is forwarded to a backend server in a round-robin fashion.

    But in some scenarios we need sticky sessions, meaning requests from one client should always be forwarded to the same backend server.

    After checking the Kubernetes documentation we added a few annotations to the ingress configuration, and it works well.

    annotations:
      nginx.ingress.kubernetes.io/affinity: "cookie"
      nginx.ingress.kubernetes.io/session-cookie-name: "router"
      nginx.ingress.kubernetes.io/session-cookie-hash: "sha1"
    

    If you open Developer Tools in Chrome, you will find the cookie.

     
  • Wang 22:21 on 2018-11-05
    Tags: Kubernetes

    [Presto] Secure with LDAP 

    For security reasons we decided to enable LDAP in Presto. To deploy Presto into the Kubernetes cluster we build the Presto image ourselves, including the Kerberos authentication and LDAP configurations.

    As you can see from the image’s directory structure, the configurations under etc and catalog are very important, so please pay attention to them.

    krb5.conf and xxx.keytab are used to connect to Kerberos.

    password-authenticator.properties and ldap_server.pem under etc are used to connect to LDAP; hive.properties and hive-security.json under catalog configure the Hive connector and its access rules.

    password-authenticator.properties

    password-authenticator.name=ldap
    ldap.url=ldaps://<IP>:<PORT>
    ldap.user-bind-pattern=xxxxxx
    ldap.user-base-dn=xxxxxx
    

    hive.properties

    connector.name=hive-hadoop2
    hive.security=file
    security.config-file=<hive-security.json>
    hive.metastore.authentication.type=KERBEROS
    hive.metastore.uri=thrift://<IP>:<PORT>
    hive.metastore.service.principal=<SERVER-PRINCIPAL>
    hive.metastore.client.principal=<CLIENT-PRINCIPAL>
    hive.metastore.client.keytab=<KEYTAB>
    hive.config.resources=core-site.xml, hdfs-site.xml
    

    hive-security.json

    {
      "schemas": [{
        "user": "user_1",
        "schema": "db_1",
        "owner": false
      }, {
        "user": " ",
        "schema": "db_1",
        "owner": false
      }, {
        "user": "user_2",
        "schema": "db_2",
        "owner": false
      }],
      "tables": [{
        "user": "user_1",
        "schema": "db_1",
        "table": "table_1",
        "privileges": ["SELECT"]
      }, {
        "user": "user_1",
        "schema": "db_1",
        "table": "table_2",
        "privileges": ["SELECT"]
      }, {
        "user": "user_2",
        "schema": "db_1",
        "table": ".*",
        "privileges": ["SELECT"]
      }, {
        "user": "user_2",
        "schema": "db_2",
        "table": "table_1",
        "privileges": ["SELECT"]
      }, {
        "user": "user_2",
        "schema": "db_2",
        "table": "table_2",
        "privileges": ["SELECT"]
      }],
      "sessionProperties": [{
        "allow": false
      }]
    }
    
     
  • Wang 22:30 on 2018-10-15
    Tags: Kubernetes

    Jenkins pipeline & kubernetes 

    We built our deployment pipeline with Jenkins, Git, Maven, Docker, JFrog, Kubernetes, and Slack; below is the overall process:

    develop -> create branch -> push code -> git hook -> jenkins build -> code check -> unit test -> docker build -> push docker image -> deploy -> notification
    

    For every project we generate the pipeline scripts with JHipster, like this:

    The ci directory contains the Docker-related scripts and the cd directory contains the Kubernetes-related scripts (see the sketch below).
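
    A hypothetical layout, for illustration only (the actual files are generated by JHipster and vary per project):

    project/
      Jenkinsfile   # pipeline stages: build, code check, unit test, docker build/push, deploy, notify
      ci/           # Docker-related scripts (Dockerfile, image build helpers)
      cd/           # Kubernetes-related scripts (manifests and Helm values used by the deploy stage)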

    We configured Jenkins to automatically scan Git for projects that follow our naming rule; whenever there is a change in Git, Jenkins pulls the code and starts building.

     
  • Wang 22:43 on 2018-10-08
    Tags: Kubernetes

    Nginx ingress in kubernetes 

    There are three ways to expose your service: NodePort, LoadBalancer, and Ingress. Next I will introduce how to use Ingress.

    1. Deploy the ingress controller

    First you need to deploy the ingress controller, which starts the Nginx pods; Nginx then binds the domains and listens for requests.

    I built a common ingress chart for the different services, so I only need to change values-<service>.yaml and the deploy script when something changes.

    Another key point is that you must be clear about the ingress-class: different services use different ingress classes, and it gets quite messy if you mix them up.

    args:
      - /nginx-ingress-controller
      - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
      - --configmap=$(POD_NAMESPACE)/nginx-configuration
      - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
      - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
      - --ingress-class={{ .Values.server.namespace }}
      - --sort-backends=true
    

    2. Configure the service ingress

    Next we need to configure the service ingress, which appends the Nginx server configuration dynamically.

    I also built a service chart that includes the environment configurations; Jenkins and Helm use a different values-<env>.yaml when executing the pipeline deployment.

    Ingress example:

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: {{ .Values.app.name }}{{ .Values.deploy.subfix }}
      namespace: {{ .Values.app.namespace }}
      annotations:
        kubernetes.io/ingress.class: "{{ .Values.ingress.class }}"
        kubernetes.io/tls-acme: "true"
        nginx.ingress.kubernetes.io/enable-cors: "false"
        nginx.ingress.kubernetes.io/rewrite-target: /
        nginx.ingress.kubernetes.io/proxy-body-size: 10m
    spec:
      rules:
      - host: {{ .Values.ingress.hostname }}
        http:
          paths:
          - path: {{ .Values.ingress.path }}
            backend:
              serviceName: {{ .Values.app.name }}{{ .Values.deploy.subfix }}
              servicePort: {{ .Values.container.port }}
    
    
     
  • Wang 21:23 on 2018-09-06
    Tags: Kubernetes

    Probe in kubernetes 

    There are two kinds of probes in Kubernetes, readinessProbe and livenessProbe, used to detect whether your service is healthy.

    We encountered a problem when configuring the readinessProbe. There is a property named initialDelaySeconds which tells Kubernetes to start the health check after the specified number of seconds; we used our default value of 60, which means Kubernetes starts checking health 60 seconds after the container starts.

    readinessProbe:
      initialDelaySeconds: 60
      timeoutSeconds: 5
    

    Because we deployed over 20 StatefulSet pods which join together as a cluster, startup took more than 60 seconds; Kubernetes could not reach the service successfully, so it kept restarting these pods, and they restarted in a loop all the time.

    After we increased initialDelaySeconds to 120, everything worked fine.
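
    A minimal sketch of the adjusted probes, for illustration (the /health endpoint and port 8080 are placeholders, not our real service):

    readinessProbe:
      httpGet:
        path: /health            # placeholder health endpoint
        port: 8080               # placeholder container port
      initialDelaySeconds: 120   # wait for the cluster to form before the first check
      periodSeconds: 10
      timeoutSeconds: 5
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 120
      timeoutSeconds: 5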

     
  • Wang 21:56 on 2018-08-22
    Tags: Kubernetes

    Stateful deployment in kubernetes 

    If you deploy a pod with "kind: Deployment", you will lose your data when the pod restarts or is deleted.

    That’s not acceptable when we want to deploy a storage system like Redis or Elasticsearch; in this case we need to use a StatefulSet.

    For the concrete explanation please refer to the official documentation; a StatefulSet uses a PVC (Persistent Volume Claim) as storage, and the PVC will exist all the time no matter what happens to the pod.

    You must specify the PVC in the StatefulSet’s YAML file like this:

    volumeClaimTemplates:
    - metadata:
        name: redis
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: fast
        resources:
          requests:
            storage: 10Gi
    

    Please also pay attention to the PVC’s name: there is a rule for mapping StatefulSet pods to PVC names which is not clearly covered by the documentation (each pod gets a claim named <volumeClaimTemplate name>-<pod name>, for example redis-redis-0 for the first replica of a StatefulSet named redis).
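
    A sketch of how the names line up, assuming a StatefulSet named redis that uses the volumeClaimTemplate above (the image tag and mount path are placeholders):

    # Pod template inside the StatefulSet: the volumeMount name must match
    # the volumeClaimTemplate name ("redis").
    containers:
    - name: redis
      image: redis:5.0        # placeholder image
      volumeMounts:
      - name: redis           # same name as the volumeClaimTemplate
        mountPath: /data      # placeholder mount path
    # Kubernetes creates one PVC per pod, named <template name>-<pod name>:
    #   redis-redis-0, redis-redis-1, ...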

     
  • Wang 19:25 on 2018-08-11
    Tags: Kubernetes

    Auto scaling in kubernetes 

    When we deploy an API in Kubernetes we must define a replica count for the pod, but as we know there will be high traffic during peak time, and we usually can’t estimate the service capacity exactly the first time. In this case we must scale the service, for example by creating more pods to share the online traffic and avoid the service crashing.

    Before using Kubernetes we usually scaled services manually, adding more nodes during peak time and destroying them when traffic calmed down.

    In Kubernetes there is a feature called HPA (Horizontal Pod Autoscaler) which can scale your service automatically. You specify the minimum and maximum replica counts in the YAML file, and HPA monitors the pods’ CPU and memory by collecting pod metrics; if a metric exceeds the threshold you defined in the YAML file, HPA automatically creates more pods and joins them to the service cluster to share the traffic.

    Here is a simple HPA sample:

    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    metadata:
      name: hpa-demo
      namespace: test-ns
      labels:
        app: hpa-demo
        component: api
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hpa-demo
      minReplicas: 3
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: memory
          targetAverageUtilization: 75
      - type: Resource
        resource:
          name: cpu
          targetAverageUtilization: 75
    

    This defines at least 3 replicas for the pod; if CPU or memory usage goes over 75%, HPA will scale up, to at most 10 pods.

    HPA collects pod metrics through metrics-server.

     
  • Wang 22:03 on 2018-07-30
    Tags: Kubernetes

    Deploy services with Helm in Kubernetes 

    As we know, if you want to deploy a service you first need to write several YAML files, such as the deployment, service, and ingress files.

    Then you execute kubectl create -f <Yaml File> several times when you create the service, and delete several times when you destroy the service. It’s a little tedious…

    Although you could write all the configurations in just one YAML file, it’s hard to maintain. For example, you can’t define a variable that is used by many pods, and you can’t upgrade or roll back a deployment easily.

    With Helm these problems are easy to solve: just execute one command like helm install <Chart> and Helm will deploy all the pods at once; you can check the deployment’s status with helm list, upgrade the service with helm upgrade, and so on.

    There are lots of stable charts in the Helm repository, and you can also define a chart yourself if they don’t meet your requirements.

    Here is the chart structure from the Helm official website:

    wordpress/
      Chart.yaml          # A YAML file containing information about the chart
      LICENSE             # OPTIONAL: A plain text file containing the license for the chart
      README.md           # OPTIONAL: A human-readable README file
      requirements.yaml   # OPTIONAL: A YAML file listing dependencies for the chart
      values.yaml         # The default configuration values for this chart
      charts/             # A directory containing any charts upon which this chart depends.
      templates/          # A directory of templates that, when combined with values,
                          # will generate valid Kubernetes manifest files.
      templates/NOTES.txt # OPTIONAL: A plain text file containing short usage notes
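
    For example, a value defined once in values.yaml can be reused by every template in the chart. A minimal sketch, assuming app.name and app.namespace values like the ones our ingress template above uses:

    # values.yaml
    app:
      name: api-demo
      namespace: test-ns

    # templates/deployment.yaml (excerpt)
    metadata:
      name: {{ .Values.app.name }}
      namespace: {{ .Values.app.namespace }}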
    
     