Version: 1.3.2

Document Enhancer

What is the purpose of this Helm chart

This Helm chart provides a jadice flow stack for enhancing documents by converting them to a more accessible format.

Document Enhancer Workflow Example: An input file (PDF, Office, MODCA or Tiff) is converted to HTML

Where to get it

After you have received the credentials (technical user) for accessing our Helm chart and container registry, you can log in with these credentials at https://artifacts.jadice.com and change your password.

To add the Helm repository to your local repositories, run the following command:

$ helm repo add levigo https://artifacts.jadice.com/repository/helm-charts/ --username <username>
Password: <enter your password>

When you see the password prompt please enter your password.

You now have access to the Helm chart jf-document-enhancer in your helm cli.

There is also the option to download the Helm Chart with the command:

$ helm pull levigo/jf-document-enhancer

TL;DR

You need to provide the container registry credentials for the jadice flow images (usually the same as for the Helm repository, or your own private registry).

A minimal values.yaml with these required parameters would look something like this:

secrets:
  imageRegistry:
    server: registry.jadice.com
    username: myJadiceRegistryUser
    password: myJadiceRegistryPassword

To install the chart from the levigo helm repository with the release name my-release:

$ helm repo add levigo https://artifacts.jadice.com/repository/helm-charts/ --username <username>
Password: <enter your password>
$ helm install --values values.yaml my-release levigo/jf-document-enhancer

Installation and Configuration

Prerequisites

Kubernetes
Container Image Access

Because the images used in this chart are from a private container registry you need to have

  • access to the container registry registry.jadice.com OR a proxy of it

For details see chapter configuration → image pull secrets.

Storage

The default storage for storing job results is Eureka. It requires zero configuration for test installations. For enabling persistence and configuring other storage backends, see configure storage.

Installing the Chart

To install the chart with the release name my-release:

$ helm repo add levigo https://artifacts.jadice.com/repository/helm-charts/ --username <username>
Password: <enter your password>
$ helm install --values values.yaml my-release levigo/jf-document-enhancer

The command deploys jf-document-enhancer on the Kubernetes cluster in the default configuration. The configuration section lists the parameters that can be configured during installation.

Uninstalling the Chart

To uninstall the my-release deployment:

$ helm uninstall my-release

The command removes all the Kubernetes components associated with the chart and deletes the release.

Optional: Configure external database

The default database is an embedded H2 database. If you want to use an existing external database, provide a connection to your database via jf-controller.datasource.jdbcUrl.

## runtime database connection
## On MySQL, add "?createDatabaseIfNotExist=true" to auto-create schema
## (username and password are taken from the values secrets.database.username and secrets.database.password
## or from the kubernetes secret referenced by jf-controller.auth.existingDBSecret)
jf-controller:
  datasource:
    jdbcUrl: "jdbc:h2:mem:jadice-flow-db;INIT=CREATE SCHEMA IF NOT EXISTS JADICE_FLOW"
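
As a rough sketch of what pointing the controller at an external MySQL database could look like, the following snippet writes a separate values file. The host name mysql.example.com, the schema name jadice_flow, the file name values-db.yaml and the credentials are placeholders chosen for illustration, not chart defaults. Only the keys jf-controller.datasource.jdbcUrl, secrets.database.username and secrets.database.password come from the comments above.

```shell
#!/bin/sh
# Sketch only: write an extra values file for an external MySQL database.
# Host, port, schema name and credentials are hypothetical placeholders.
cat > values-db.yaml <<'EOF'
jf-controller:
  datasource:
    # "?createDatabaseIfNotExist=true" lets MySQL auto-create the schema
    jdbcUrl: "jdbc:mysql://mysql.example.com:3306/jadice_flow?createDatabaseIfNotExist=true"
secrets:
  database:
    username: myDatabaseUser
    password: myDatabasePassword
EOF

# Possible usage (not executed here); later --values files override earlier ones:
#   helm install --values values.yaml --values values-db.yaml my-release levigo/jf-document-enhancer
```

Keeping the database settings in their own file is just one option; the same keys can equally be merged into the main values.yaml.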

Configure external access

If you want to access the service from outside the cluster, you can define Ingress resources that could look like this:

Ingress Example
jf-controller:
  ingress:
    enabled: true
    className: "nginx"
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
      nginx.ingress.kubernetes.io/configuration-snippet: |
        server_tokens off;
        location /actuator {
          deny all;
          return 403;
        }
    hosts:
      - host: jf-controller.acme.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: jf-controller.acme.com
        hosts:
          - jf-controller.acme.com

You can create an Ingress for the controller-ui in the same way.

Keep in mind that this is only an example and makes certain assumptions. For instance, it assumes that you are using Cert Manager and have defined a cluster-issuer named letsencrypt-prod. For more information on Cert Manager and how to create a certificate, see Certificate. The example also blocks external access to the actuator endpoints. Depending on your monitoring solution, this may not be feasible.

Configure Storage

The default storage backend is Eureka, but S3 can be used instead. To switch from Eureka to S3, set the global parameter global.jadiceFlow.storageType in the values.yaml to "s3" (see global parameters) and add the following parameters to the values.yaml:

Parameter              Description                Default
secrets.s3.bucket      the S3 bucket              nil
secrets.s3.endpoint    the URL of the S3 server   nil
secrets.s3.accessKey   the S3 access key          nil
secrets.s3.secretKey   the S3 secret key          nil
secrets.s3.protocol    HTTP or HTTPS              https
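
Put together, the switch to S3 might look like the sketch below, which writes the storage settings into a separate values file. The bucket name, endpoint, keys and the file name values-s3.yaml are placeholders. The source does not say whether the endpoint should include a scheme, so the bare host name used here is an assumption.

```shell
#!/bin/sh
# Sketch only: write an extra values file that selects S3 as storage backend.
# All values are hypothetical placeholders; replace them with your own.
cat > values-s3.yaml <<'EOF'
global:
  jadiceFlow:
    storageType: "s3"
secrets:
  s3:
    bucket: my-jadice-flow-bucket
    endpoint: s3.example.com   # assumption: host name without scheme
    accessKey: myAccessKey
    secretKey: mySecretKey
    protocol: https
EOF

# Possible usage (not executed here); later --values files override earlier ones:
#   helm upgrade --install --values values.yaml --values values-s3.yaml my-release levigo/jf-document-enhancer
```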

Note: To enable persistent storage for Eureka, add the following values:

eureka:
  ## @param persistence.enabled Enable persistence on neverpile eureka using a `PersistentVolumeClaim` (using the cluster's default storage class)
  persistence:
    enabled: true

Validate Installation

Check that every pod is running and ready

You can check the status of the deployment by checking if all the pods are up and running:

kubectl get pods

This will return something like this:

NAME                                                    READY   STATUS    RESTARTS   AGE
eureka-0                                                1/1     Running   0          133m
jf-controller-78bd7d5b7b-rw6q2                          1/1     Running   0          133m
jf-controller-ui-58f75c9b69-hxpsq                       1/1     Running   0          133m
jf-worker-afp-converter-df896694c-7qvmw                 1/1     Running   0          133m
jf-worker-analyzer-5fb857cb5c-qdhw7                     1/1     Running   0          133m
jf-worker-document-content-extractor-788c49c4f5-j7vgz   1/1     Running   0          133m
jf-worker-libreoffice-bcbfbb546-7c9k2                   1/1     Running   0          133m
jf-worker-tessocr-c8d856bf7-5kqvg                       1/1     Running   0          133m
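
With this many pods, scanning the list by eye gets tedious. The following is a small sketch of a filter that reads pod listings in the format shown above and prints only the pods that are not Running or not fully ready. The function name not_ready_pods is made up for this example and is not part of the chart or of kubectl.

```shell
#!/bin/sh
# not_ready_pods (hypothetical helper): reads `kubectl get pods --no-headers`
# output on stdin and prints the name of every pod whose STATUS is not
# "Running" or whose READY count (e.g. "0/1") is incomplete.
not_ready_pods() {
  awk '{
    split($2, ready, "/")   # READY column: ready containers / total containers
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Possible usage against a live cluster (not executed here):
#   kubectl get pods --no-headers | not_ready_pods
```

An empty result means every listed pod is running and ready. Note that this simple check would also report pods in the Completed state, if a namespace contains any.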

Check functionality with Postman

To validate your deployment you can use the Postman collection jf-document-enhancer.postman_collection.json that is part of this Helm chart. You can get it by pulling the chart with 'helm pull levigo/jf-document-enhancer'. For instructions on how to import a collection, see Importing data into Postman.

After you have imported the collection you need to create a variable JF_CONTROLLER_URL pointing to your jf-controller (e.g. https://jf-controller.email.acme.com). Information about the jadice flow REST-API can be found here.

Troubleshooting

If one or more pods stay in the Pending state, they probably cannot be scheduled onto a node. This can happen if not enough resources are available. You can check for messages from the scheduler with:

kubectl describe pods <POD_NAME>

If the output of the describe command does not provide enough information, you can tail the logs of a pod with this command:

kubectl logs -f <POD_NAME>
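
The scheduler messages mentioned above appear in the Events section at the end of the describe output. As a convenience, a sketch like the following trims the output down to that section. The function name pod_events is an invention for this example.

```shell
#!/bin/sh
# pod_events (hypothetical helper): reads `kubectl describe pods <POD_NAME>`
# output on stdin and prints everything from the "Events:" line onwards.
pod_events() {
  awk '/^Events:/ { show = 1 } show'
}

# Possible usage against a live cluster (not executed here):
#   kubectl describe pods <POD_NAME> | pod_events
```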

Included Workers

More Information

For more information, check out the Readme.md included in the Helm chart. You can find the Readme.md in the tgz file that can be downloaded with the command:

helm pull levigo/jf-document-enhancer