Document Enhancer
What is the purpose of this Helm chart
This Helm chart provides a jadice flow stack for enhancing documents by converting them to a more accessible format.
Example: An input file (PDF, Office, MODCA or TIFF) is converted to HTML.
Where to get it
After you have received the credentials (technical user) for our Helm chart and container registry, log in with them at https://artifacts.jadice.com and change your password.
Then add the Helm repository with the following command:
$ helm repo add levigo https://artifacts.jadice.com/repository/helm-charts/ --username <username>
Password: <enter your password>
When you see the password prompt, enter your password.
You now have access to the Helm chart jf-document-enhancer in your Helm CLI.
You can also download the Helm chart with the command:
$ helm pull levigo/jf-document-enhancer
TL;DR
You need to provide the container registry credentials for the jadice flow images (usually the same as for the Helm repository, or those for your own private registry).
A minimal values.yaml with these required parameters would look something like this:
secrets:
  imageRegistry:
    server: registry.jadice.com
    username: myJadiceRegistryUser
    password: myJadiceRegistryPassword
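If you prefer not to type the registry password into a hand-edited file, the minimal values file above could also be generated from shell variables. This is only a sketch: write_values is a hypothetical helper that is not part of the chart, and it assumes that neither the user name nor the password contains a single quote.

```shell
#!/bin/sh
# write_values FILE USERNAME PASSWORD
# Writes the minimal values file shown above. write_values is a hypothetical
# helper, not part of the chart. Assumes USERNAME and PASSWORD contain no
# single quote, because they are emitted as single-quoted YAML strings.
write_values() {
  (
    umask 077   # a newly created file is readable by the current user only
    cat > "$1" <<EOF
secrets:
  imageRegistry:
    server: registry.jadice.com
    username: '$2'
    password: '$3'
EOF
  )
}
```

A call such as write_values values.yaml "$REGISTRY_USER" "$REGISTRY_PASSWORD" (both variable names are placeholders) then produces the file that is passed to helm install with --values.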
To install the chart from the levigo helm repository with the release name my-release:
$ helm repo add levigo https://artifacts.jadice.com/repository/helm-charts/ --username <username>
Password: <enter your password>
$ helm install --values values.yaml my-release levigo/jf-document-enhancer
Installation and Configuration
Prerequisites
Kubernetes
- Kubernetes 1.19+
- Helm 3.1.0+
- an Ingress Controller
Container Image Access
Because the images used in this chart come from a private container registry, you need access to the container registry registry.jadice.com or to a proxy of it.
For details see chapter configuration → image pull secrets.
Storage
By default, job results are stored in Eureka, which requires no configuration for test installations. To enable persistence or to configure other storage backends, see Configure Storage.
Installing the Chart
To install the chart with the release name my-release:
$ helm repo add levigo https://artifacts.jadice.com/repository/helm-charts/ --username <username>
Password: <enter your password>
$ helm install --values values.yaml my-release levigo/jf-document-enhancer
The command deploys jf-document-enhancer on the Kubernetes cluster in the default configuration. The configuration section lists the parameters that can be configured during installation.
Uninstalling the Chart
To uninstall the my-release deployment:
$ helm uninstall my-release
The command removes all the Kubernetes components associated with the chart and deletes the release.
Optional: Configure external database
The default database is an embedded H2 database. If you want to use an existing external database, provide a connection to it via jf-controller.datasource.jdbcUrl.
## runtime database connection
## On MySQL, add "?createDatabaseIfNotExist=true" to auto-create schema
## (username and password are taken from the values secrets.database.username and secrets.database.password
## or from the kubernetes secret referenced by jf-controller.auth.existingDBSecret)
jf-controller:
  datasource:
    jdbcUrl: "jdbc:h2:mem:jadice-flow-db;INIT=CREATE SCHEMA IF NOT EXISTS JADICE_FLOW"
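As an illustration of the comments above, an override file for an external MySQL database might be written as follows. This is a sketch under assumptions: the host mysql.example.internal, the schema name jadice_flow, the user flow_user and the password are placeholders, and the file name values-db.yaml is arbitrary. Only the keys secrets.database.username, secrets.database.password and jf-controller.datasource.jdbcUrl come from this README.

```shell
#!/bin/sh
# Write an override file for an external MySQL database.
# Host, schema, user and password below are placeholders; replace them
# with the values of your own database.
cat > values-db.yaml <<'EOF'
secrets:
  database:
    username: flow_user
    password: change-me
jf-controller:
  datasource:
    # createDatabaseIfNotExist=true lets MySQL auto-create the schema
    jdbcUrl: "jdbc:mysql://mysql.example.internal:3306/jadice_flow?createDatabaseIfNotExist=true"
EOF

# The file can then be passed in addition to the base values, for example:
#   helm upgrade --install my-release levigo/jf-document-enhancer \
#     --values values.yaml --values values-db.yaml
```

Helm merges multiple --values files from left to right, so the override file only needs to contain the keys that differ from the base values.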
Configure external access
If you want to access the service from outside the cluster, you can define Ingress resources such as the following:
Ingress Example
jf-controller:
  ingress:
    enabled: true
    className: "nginx"
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
      nginx.ingress.kubernetes.io/configuration-snippet: |
        server_tokens off;
        location /actuator {
          deny all;
          return 403;
        }
    hosts:
      - host: jf-controller.acme.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: jf-controller.acme.com
        hosts:
          - jf-controller.acme.com
You can create an Ingress for the controller-ui in the same way.
Keep in mind that this is only an example and that it makes certain assumptions. For instance, it assumes that you are using Cert Manager and have defined a cluster-issuer named letsencrypt-prod.
For more information on Cert Manager and how to create a certificate, see Certificate. The example also hides the actuator endpoints from external access. Depending on your monitoring solution, this may not be feasible.
Configure Storage
The default storage backend is Eureka, but S3 can be used instead. To switch from Eureka to S3, set the global parameter global.jadiceFlow.storageType in values.yaml to "s3" (see global parameters) and add the following parameters to values.yaml:
Parameter | Description | Default |
---|---|---|
secrets.s3.bucket | the S3 bucket | nil |
secrets.s3.endpoint | the URL of the S3 server | nil |
secrets.s3.accessKey | the S3 access key | nil |
secrets.s3.secretKey | the S3 secret key | nil |
secrets.s3.protocol | HTTP or HTTPS | https |
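Put together, the parameters from the table could form an override file like the one below. Treat it as a sketch: the bucket name, endpoint and keys are placeholders, the file name values-s3.yaml is arbitrary, and the table does not say whether the endpoint should include a scheme, so verify the expected format before relying on it.

```shell
#!/bin/sh
# Write an override file that switches the storage backend to S3.
# Bucket, endpoint and keys are placeholders. Whether the endpoint must
# include a scheme is not stated in the parameter table, so this format
# is an assumption.
cat > values-s3.yaml <<'EOF'
global:
  jadiceFlow:
    storageType: "s3"
secrets:
  s3:
    bucket: my-flow-bucket
    endpoint: s3.example.internal
    accessKey: my-access-key
    secretKey: my-secret-key
    protocol: https
EOF
```

The file is passed to helm install or helm upgrade with an additional --values values-s3.yaml argument.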
Note: To enable persistent storage for Eureka, add the following values:
eureka:
  ## @param persistence.enabled Enable persistence on neverpile eureka using a `PersistentVolumeClaim` (using the cluster's default storage class)
  persistence:
    enabled: true
Validate Installation
Check that every pod is running and ready
You can check the status of the deployment by verifying that all pods are up and running:
kubectl get pods
This will return something like this:
NAME                                                    READY   STATUS    RESTARTS   AGE
eureka-0                                                1/1     Running   0          133m
jf-controller-78bd7d5b7b-rw6q2                          1/1     Running   0          133m
jf-controller-ui-58f75c9b69-hxpsq                       1/1     Running   0          133m
jf-worker-afp-converter-df896694c-7qvmw                 1/1     Running   0          133m
jf-worker-analyzer-5fb857cb5c-qdhw7                     1/1     Running   0          133m
jf-worker-document-content-extractor-788c49c4f5-j7vgz   1/1     Running   0          133m
jf-worker-libreoffice-bcbfbb546-7c9k2                   1/1     Running   0          133m
jf-worker-tessocr-c8d856bf7-5kqvg                       1/1     Running   0          133m
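With this many pods, scanning the list by eye is error-prone. A small filter such as the following could print only the pods that are not yet ready. It is a sketch that assumes the default column layout of kubectl get pods shown above; not_ready_pods is a hypothetical helper and not part of the chart.

```shell
#!/bin/sh
# not_ready_pods: reads `kubectl get pods` output on stdin and prints the
# name of every pod whose READY count is incomplete or whose STATUS is not
# Running. Hypothetical helper, not part of the chart.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")          # READY column, e.g. "1/1"
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}
```

Run it as kubectl get pods | not_ready_pods; empty output means every listed pod is running and ready. Alternatively, kubectl wait --for=condition=Ready pods --all --timeout=300s blocks until all pods in the namespace report ready or the timeout expires.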
Check functionality with Postman
To validate your deployment, you can use the Postman collection jf-document-enhancer.postman_collection.json that is part of this Helm chart.
You can get it by pulling the chart with helm pull levigo/jf-document-enhancer. For instructions on how to import a collection, see Importing data into Postman.
After you have imported the collection, create a variable JF_CONTROLLER_URL pointing to your jf-controller (e.g. https://jf-controller.email.acme.com). Information about the jadice flow REST-API can be found here.
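Since helm pull downloads the chart as a .tgz archive, the collection has to be unpacked before Postman can import it. One way to do that is sketched below. find_collection is a hypothetical helper, and because the location of the collection inside the archive is not documented here, it searches the unpacked tree by file name.

```shell
#!/bin/sh
# find_collection CHART_TGZ DEST_DIR
# Unpacks a pulled chart archive into DEST_DIR and prints the path of the
# Postman collection. Hypothetical helper; the archive name depends on the
# chart version you pulled.
find_collection() {
  mkdir -p "$2" &&
  tar -xzf "$1" -C "$2" &&
  find "$2" -type f -name 'jf-document-enhancer.postman_collection.json'
}
```

After helm pull levigo/jf-document-enhancer, a call like find_collection jf-document-enhancer-*.tgz ./chart prints the path to import, provided that exactly one matching archive is in the current directory.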
Troubleshooting
If one or more pods stay in the Pending state, they probably cannot be scheduled onto a node. This can happen if there are not enough resources available. You can check for messages from the scheduler with:
kubectl describe pods <POD_NAME>
If the output of the describe command does not provide enough information, you can tail the logs of a pod with this command:
kubectl logs -f <POD_NAME>
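When several pods are pending, the describe step can be repeated for each of them. The following sketch prints only the Events section of every pending pod in the current namespace, which is where scheduler messages show up. describe_pending is a hypothetical helper and not part of the chart.

```shell
#!/bin/sh
# describe_pending: for every pod in phase Pending (current namespace),
# print the Events section of `kubectl describe`, which is where scheduler
# messages such as FailedScheduling appear. Hypothetical helper.
describe_pending() {
  kubectl get pods --field-selector=status.phase=Pending -o name |
  while read -r pod; do
    echo "== $pod"
    # -o name yields "pod/<name>", which kubectl describe accepts directly
    kubectl describe "$pod" </dev/null | sed -n '/^Events:/,$p'
  done
}
```

Calling describe_pending with no arguments prints one block per pending pod and nothing at all if no pod is pending.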
Included Workers
- worker-analyzer
- worker-libreoffice
- worker-afp-converter
- worker-tessocr
- worker-document-content-extractor
More Information
For more information, check out the Readme.md included in the Helm chart. You can find it in the .tgz file that can be downloaded with the command:
helm pull levigo/jf-document-enhancer