...
Let’s make a few HTTP requests to the service in parallel:
Code Block |
---|
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
export INGRESS_HOST=$(minikube ip)
while true; \
do curl http://$INGRESS_HOST:$INGRESS_PORT \
--header 'Host: helloworld-go.default.example.com' \
&& sleep 0.3; \
done
Hello Go Sample v1!
Hello Go Sample v1!
Hello Go Sample v1!
sudo apt-get install parallel
seq 1000 | parallel -n0 \
-j10 "curl -s http://$INGRESS_HOST:$INGRESS_PORT \
-H 'Host: helloworld-go.default.example.com'"
Hello Go Sample v1!
Hello Go Sample v1!
Hello Go Sample v1!
... |
Deploying a Kotlin Spring Boot REST server into your Knative cluster
Let’s deliver Chuck Norris jokes at scale, via a REST service endpoint developed with Spring Boot, written in Kotlin.
The advantage of this is that the code is easy to read and that the service behaves in a standards-compliant way, so we can do HTTP load testing. This way, we can see auto-scaling Serverless workloads in action.
I use a Maven package that provides the joke texts via the Spring Boot Actuator package; this way I do not need a DBMS.
The application is rather primitive. It offers two endpoints:
/
- returns HTML, via a Thymeleaf template
/joke
- returns JSON, via a library and Spring
https://code.because-security.com/marius/chucknorris-kotlin-spring
I published the image to DockerHub. You can also build it yourself and push it into your own registry.
...
I would like to highlight that no code changes are required to run this application in Knative (or Docker). The code remains untouched.
Java 11 (and versions higher than 8) is aware of Linux cgroups limits within container environments, so the container environment's memory limit is respected by the Java runtime. As a side note, you no longer need to add mystical /dev/urandom mounts.
...
Code Block |
---|
marius@shell:~/Source/chucknorris-kotlin-spring$ kubectl apply -f service.yaml
service.serving.knative.dev/chuckjokes-java-spring created |
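The contents of service.yaml are not shown above; a minimal Knative Service manifest for this app might look like the following sketch (the image reference is an assumption — substitute the path to your own registry or DockerHub build):

```yaml
# Sketch of a minimal Knative Service manifest for the joke service.
# The image reference below is an assumption; point it at your own build.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: chuckjokes-java-spring
  namespace: default
spec:
  template:
    spec:
      containers:
        - image: docker.io/<your-user>/chucknorris-kotlin-spring:latest
          ports:
            - containerPort: 8080
```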
Now query the endpoint:
Code Block |
---|
marius@shell:~$ curl -i http://$INGRESS_HOST:$INGRESS_PORT/joke \
-H "Accept: application/json" \
-H 'Host: chuckjokes-java-spring.default.example.com' \
-w "\n"
HTTP/1.1 200 OK
content-length: 55
content-type: application/json
date: Sun, 19 Apr 2020 06:56:19 GMT
server: istio-envoy
x-envoy-upstream-service-time: 7031
{"joke #1":"Chuck Norris's first program was kill -9."} |
Initially, it will be slow because it’s instantiating the application on demand. You can use this to warm up the cluster 🔥
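If the cold start is undesirable, Knative can also keep a minimum number of instances warm via an annotation on the revision template — a sketch (service name matching the one deployed above):

```yaml
# Sketch: keep at least one instance running to avoid cold starts,
# at the cost of giving up scale-to-zero for this Service.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: chuckjokes-java-spring
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
```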
I tried to get an Ubuntu VM to make more than 200 curl requests per second using GNU parallel. Although I unset the limits, I only reached about 162 requests per second, which is not enough to put the cluster under stress (the default requests-per-second target that triggers the Knative autoscaler is 200):
Code Block |
---|
marius@shell:~/$ kubectl -n knative-serving describe cm config-autoscaler
# The requests per second (RPS) target default is what the Autoscaler will
# try to maintain ...
requests-per-second-target-default: "200" |
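Instead of generating more load, you could also lower that threshold for experimentation — a sketch of the corresponding ConfigMap data (100 is an arbitrary example value; the RPS target applies when the autoscaler is configured to use the rps metric):

```yaml
# Sketch: lower the autoscaler's default RPS target so scaling
# triggers under lighter load. Edit in place via:
#   kubectl -n knative-serving edit cm config-autoscaler
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  requests-per-second-target-default: "100"
```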
Instead, I used wrk, also because its invocation is similar to curl’s.
Code Block |
---|
marius@shell:~/$ wrk -t12 -c400 -d300s \
http://$INGRESS_HOST:$INGRESS_PORT/joke \
-H 'Host: chuckjokes-java-spring.default.example.com' |
On the right, you can see the Pods serving Chuck Norris jokes. On the left, you can see the curl command that brings up the initial instance, followed by the wrk command that puts the Service under enough load.