Version: 1.3.2

OCR Conversion

What is the purpose of this Helm chart

This product is an example of a jadice flow product. It provides a complete jadice flow stack for OCR (Optical Character Recognition).

Supported Formats

The included worker-tessocr can perform OCR on the following input formats:

  • TIFF, JPEG, GIF, PNG, and BMP image formats
  • Multi-page TIFF images
  • PDF document format

Other formats must be converted first. This can be done in a jadice flow jobtemplate. For more information see worker-tessocr.

Where to get it

After you have received the credentials (technical user) for our Helm chart and container registry, you can log in with them at https://artifacts.jadice.com and change your password.

To add the Helm repository to your local repositories, run the following command:

$ helm repo add levigo https://artifacts.jadice.com/repository/helm-charts/ --username <username>
Password: <enter your password>

When you see the password prompt please enter your password.

You now have access to the Helm chart jf-ocr from your Helm CLI.

There is also the option to download the Helm Chart with the command:

$ helm pull levigo/jf-ocr
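
If you want to inspect the chart before installing it, the pull can be combined with unpacking and printing the default values. The following is a minimal sketch, assuming the levigo repository has already been added as shown above; the function name fetch_chart and the target directory ./charts are hypothetical.

```shell
# Sketch: download the chart, unpack it, and print its default values.
# fetch_chart and the default directory ./charts are illustrative names only.
fetch_chart() {
  chart_dir="${1:-./charts}"
  # --untar unpacks the downloaded archive, --untardir selects the target directory.
  helm pull levigo/jf-ocr --untar --untardir "$chart_dir" || return 1
  # Print the chart's default values for comparison with your own values.yaml.
  helm show values levigo/jf-ocr
}
```

Calling fetch_chart without arguments unpacks the chart into ./charts and writes the default values to standard output.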

TL;DR

You need to provide some Required Parameters. These are:

  • the secrets.controller.accessToken, used by the jf-controller for authentication to the workers
  • the storage credentials - e.g. S3
  • the container registry credentials for the jadice flow images (usually the same as for the Helm repository, or your own private registry)

The credentials for the controller database - by default H2 - can be changed, too.

A minimal values.yaml with these required parameters would look something like this:

secrets:
  useSealedSecrets: false
  controller:
    accessToken: MY-ACCESS-TOKEN
    database:
      username: myUsername
      password: myPassword
  s3:
    bucket: jadice-flow-bucket
    endpoint: s3.acme.com
    accessKey: myAccessKey
    secretKey: mySecretKey
  ## uncomment when using WebDAV and remove S3 block above
  # webdav:
  #   endpoint: https://webdav.acme.com/
  #   username: myUser
  #   password: myPassword
  #   outputPath: results
## values you'll probably need, because jadice images are only available in a private registry
jadiceFlow:
  image:
    ## path to proxy registry of "registry.jadice.com"
    ## (required if direct access to "registry.jadice.com" is not possible)
    registry: jadice.proxy.registry.acme.com
    ## pull secret for proxy registry or "registry.jadice.com"
    pullSecrets:
      - "my-pull-secret"
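
The pull secret named in the values above has to exist in the target namespace before the pods can pull their images. One way to create it is sketched below, assuming a standard Docker registry secret is what the chart expects; the function name create_pull_secret is hypothetical, and the secret name and registry host are taken from the example values.

```shell
# Sketch: create an image pull secret for the jadice registry or its proxy.
# create_pull_secret is an illustrative helper, not part of the chart.
create_pull_secret() {
  secret_name="$1"
  registry="$2"
  user="$3"
  pass="$4"
  # Creates a secret of type kubernetes.io/dockerconfigjson in the current namespace.
  kubectl create secret docker-registry "$secret_name" \
    --docker-server="$registry" \
    --docker-username="$user" \
    --docker-password="$pass"
}
```

For example, create_pull_secret my-pull-secret jadice.proxy.registry.acme.com "$REGISTRY_USER" "$REGISTRY_PASSWORD" would match the sample values. Note that a password passed on the command line may end up in the shell history.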

To install the chart from the levigo helm repository with the release name my-release:

$ helm repo add levigo https://artifacts.jadice.com/repository/helm-charts/ --username <username>
Password: <enter your password>
$ helm install --values values.yaml my-release levigo/jf-ocr
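
Before installing, it can help to confirm that values.yaml no longer contains the sample values from the example. The check below is only a sketch: the function name check_values is hypothetical, and the list of placeholders is simply copied from the sample file above.

```shell
# Sketch: fail if values.yaml still contains the sample placeholders.
# Returns 0 if the file looks filled in, 1 otherwise.
check_values() {
  values_file="${1:-values.yaml}"
  [ -f "$values_file" ] || { echo "missing: $values_file" >&2; return 1; }
  # Drop commented lines first, then search for the sample values.
  if grep -v '^[[:space:]]*#' "$values_file" \
      | grep -E -q 'MY-ACCESS-TOKEN|myAccessKey|mySecretKey|myPassword|acme\.com'; then
    echo "placeholder values found in $values_file" >&2
    return 1
  fi
  return 0
}
```

A typical use would be check_values values.yaml && helm install --values values.yaml my-release levigo/jf-ocr.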

Installation and Configuration

Prerequisites

Kubernetes
Storage
  • The default storage for job results is Eureka. It requires zero configuration for test installations. It is also possible to enable persistence and to configure other storage backends.

Container Image Access

Because the images used in this chart come from a private container registry, you need:

  • access to the container registry registry.jadice.com OR a proxy of it

API Access Token

  • a jadice flow access token

Installing the Chart

To install the chart with the release name my-release:

$ helm repo add levigo https://artifacts.jadice.com/repository/helm-charts/ --username <username>
Password: <enter your password>
$ helm install --values values.yaml my-release levigo/jf-ocr

The command deploys jf-ocr on the Kubernetes cluster in the default configuration. The configuration section lists the parameters that can be configured during installation.
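
As a variation on the install command above, the release can be installed into a dedicated namespace and the command can wait until the resources are ready. This is a sketch rather than the documented procedure: the function name install_release, the namespace argument, and the 10 minute timeout are assumptions.

```shell
# Sketch: install or upgrade the release and wait for it to become ready.
# install_release, the namespace default, and the timeout are illustrative.
install_release() {
  release="${1:-my-release}"
  namespace="${2:-default}"
  # "upgrade --install" installs the release if it is absent and upgrades it otherwise.
  helm upgrade --install "$release" levigo/jf-ocr \
    --values values.yaml \
    --namespace "$namespace" --create-namespace \
    --wait --timeout 10m
}
```

Because the command is idempotent, the same call can be repeated after changing values.yaml.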

Uninstalling the Chart

To uninstall the my-release deployment:

$ helm uninstall my-release

The command removes all the Kubernetes components associated with the chart and deletes the release.

Check that every pod is running and ready

You can check the status of the deployment by checking if all the pods are up and running:

kubectl get pods
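
With many pods, the listing can be filtered down to those that are not yet ready. The sketch below assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...); the function name not_ready_pods is hypothetical.

```shell
# Sketch: print pods that are not fully ready.
# Reads the output of "kubectl get pods --no-headers" from standard input.
not_ready_pods() {
  awk '{
    # READY has the form ready/total, for example 1/1.
    split($2, r, "/")
    # Finished job pods are expected to be not ready, so skip them.
    if ($3 == "Completed") next
    if (r[1] != r[2] || $3 != "Running") print $1, $2, $3
  }'
}
```

Usage: kubectl get pods --no-headers | not_ready_pods. An empty result means every listed pod is running with all containers ready.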

Check functionality with Postman

To validate your deployment, you can use the Postman collection jadice-flow tutorial ocr.postman_collection.json that is part of this Helm chart. You can get it by pulling the chart with 'helm pull levigo/jf-ocr'. For instructions on how to import a collection, see Importing data into Postman.

After importing the collection, create a variable JF_CONTROLLER_URL pointing to your jf-controller (e.g. https://jf-controller.email.acme.com) and add your access token as a Bearer Token to the collection. For details on the endpoints, see the jadice flow REST API documentation.

Troubleshooting

If one or more pods stay in the Pending state, they probably cannot be scheduled onto a node. This can happen if there are not enough resources available. You can check for messages from the scheduler with:

kubectl describe pods <POD_NAME>
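
To avoid looking up pod names by hand, the pending pods can be selected and described in one go. This is a minimal sketch that works in the current namespace; the function name describe_pending is hypothetical.

```shell
# Sketch: describe every pod that is currently in phase Pending.
describe_pending() {
  # "-o name" prints one resource per line in the form pod/<name>.
  kubectl get pods --field-selector=status.phase=Pending -o name \
    | while read -r pod; do
        kubectl describe "$pod"
      done
}
```

The scheduler messages appear in the Events section at the end of each description. Running kubectl get events --sort-by=.lastTimestamp shows the same events for the whole namespace in chronological order.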

If the output of the describe command does not provide enough information, you can tail the logs of a pod with this command:

kubectl logs -f <POD_NAME>

Included Workers

More Information

For more information, check out the Readme.md included in the Helm chart. You can find the Readme.md in the tgz file that can be downloaded with the command:

helm pull levigo/jf-ocr
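
The readme can also be printed without unpacking the archive. A short sketch, assuming the levigo repository has been added; the function name show_chart_readme is hypothetical.

```shell
# Sketch: print the chart's readme directly from the repository.
show_chart_readme() {
  chart="${1:-levigo/jf-ocr}"
  # "helm show readme" writes the chart's README to standard output.
  helm show readme "$chart"
}
```

Piping the output into a pager, for example show_chart_readme | less, makes longer readmes easier to browse.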