Get Started

What is DataMinder?

DataMinder is a framework for building, running and monitoring processes that in some way modify data.

How does DataMinder work?

DataMinder stores data in a data table that is passed to each plugin in the current process. A plugin can perform some work on the data (e.g. clean it up), add new data (e.g. from a database, file or web service call) or write the data somewhere (e.g. to a database, file or mail).

How do I get DataMinder?

I don't read manuals. Get me to a minimal "Get started" further down the page

DataMinder key concepts

With DataMinder you can:

When building processes, you combine DataMinder plugins which are small, configurable building blocks that implement some specific function. The following plugin types exist today:

By combining the plugins into processes, DataMinder can be used to solve a lot of common problems facing programmers or IT administrators e.g.

All plugins are run and managed by DataMinder so you can focus only on the problem instead of spending energy on just getting the surrounding systems to work.

By combining existing plugins into processes or building new plugins only your ingenuity and imagination set the limit to what you can do with DataMinder.

Start building processes and/or develop your own plugin library

With DataMinder you can quickly build automated data flows by combining plugins into processes. Processes can read/write data, modify data etc.

Key concepts:

The basic workflow:

  1. Decide what to do: Create a process for the job you have in mind, e.g. a web service or a database sync
  2. Collect the plugins: In the plugins library, get all plugins you need to implement the process
  3. Drag’n’Drop plugins to build the process: In the build window, assemble your process while debugging it step by step

What next?

With DataMinder you:

Minimal "Get started"

For people in a hurry

Give it less than 5 minutes and you will understand how to work with DataMinder.

The quickest way to start working with DataMinder is to build and test run a very simple process with one task, e.g. creating uids (unique identifiers). Make it accessible through a web service interface where a parameter specifies the number of uids to create.

Create simple process with one task to generate uids and return them by web service call

  1. Go to the plugins tab and add the following 3 plugins to the toolbox:
    • DM Basic Plugins > Uid and counter > "Generate Uid"
    • DM Basic Plugins > Services and Listeners > Http/https > "HTTP Web Service"
    • DM Basic Plugins > Services and Listeners > Http/https > "HTTP Listener"
  2. Go to the runnables tab. Create a category named: "My processes".
  3. Open the category, Drag’n’Drop the service "HTTP Web Service" from the toolbox onto it and double-click the service plugin to configure it with the values below. This activates an HTTP/HTTPS web service.
    • Allow only localhost = false
    • Allow http from localhost = true
    • Allow http from non-local host = true
    • Name of secret key = (leave empty)
    • Secret key value = (leave empty)
  4. Create a process under category "My processes" named: "Get uids" (or whatever name you prefer).
  5. Go to the process, open the Scheduler/Listener dialog, Drag’n’Drop the listener "HTTP Listener" into it and double-click it to configure.
    • Listens to service = HTTP Web Service (or name of the HTTP service above)
    • Path to process = /getUids
  6. Go to the process, open the Build/Debug dialog, Drag’n’Drop the task "Generate Uid" into it and double-click it to configure.
    • UID length = 20 (or whatever length you like)
    • Column to store UIDs in = UID (or whatever column name you like)
    • Number of values to create = mark the checkbox "Value from table" to indicate the value will come from the web service call as a parameter.
  7. Call your process at the web service interface (default web service port 9080) to create 10 uids:

    http://127.0.0.1:9080/getUids?Number of values to create=10

And check that the process returns JSON data:

    {
    "columns":["Number of values to create","UID"]
    ,
    "data":[
    ["10","zwymETliyE2tsyJZcDDI"]
    ,["","X1tFU7PslEByoczSwBSg"]
    ,["","LKWwqESp6cVR0UxqCdSn"]
    ,["","cXT9vTQHUi7KDU7Nx5xm"]
    ,["","XU99g0CngBXVj7UBphXX"]
    ,["","fJSygZQAujlTIrNpESlT"]
    ,["","nYj8FDXSXH3Gm0rJkqla"]
    ,["","R6AQQX75kG6bJcTjGIX4"]
    ,["","c4qFMypOpZEdwIqAVMSq"]
    ,["","imr3TpTLUdPeQ7g9tzF5"]
    ]
    ,
    "results":[
    {"index":"1","objectType":"Task","objectName":"Generate Uid","objectId":"9R",
      "result":{"status":"OK","info":""}},
    {"index":"2","objectType":"Process","objectName":"Get uids as web service","objectId":"9T",
      "result":{"status":"OK","info":""}}]}
  

If you need to call the process with another parameter name, e.g. http://127.0.0.1:9080/getUids?count=10, you can add the "Rename Columns" task first in the process to rename the column from "count" to "Number of values to create" before the "Generate Uid" task runs.
Please see: DM Basic Plugins > Table > "Rename Columns"

If you want to remove the input parameter "Number of values to create" from the response, you can add the "Remove table columns" task last in the process flow.
Please see: DM Basic Plugins > Table > "Remove table columns"
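
The call from step 7 can also be made from a script. A browser percent-encodes the spaces in the parameter name for you, but most HTTP clients expect you to do it yourself. The Python sketch below builds the encoded URL and picks the UID column out of a response shaped like the JSON shown above. The function names are only illustrative and not part of DataMinder, and the host, port and path are the defaults from the steps above, so adjust them to your setup.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Defaults taken from the steps above; change them if your setup differs.
BASE_URL = "http://127.0.0.1:9080/getUids"


def build_url(count):
    """Return the call URL with the parameter name percent-encoded."""
    query = urlencode({"Number of values to create": count}, quote_via=quote)
    return f"{BASE_URL}?{query}"


def column_values(response, column):
    """Pick one column out of a {"columns": [...], "data": [[...]]} table."""
    index = response["columns"].index(column)
    return [row[index] for row in response["data"]]


def fetch_uids(count):
    """Call the process and return the generated uids (needs a running DataMinder)."""
    with urlopen(build_url(count)) as reply:
        return column_values(json.load(reply), "UID")
```

With DataMinder running and the process configured as above, `fetch_uids(10)` should return a list of 10 uids.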

If you skipped the key concepts, we recommend having a look at DataMinder key concepts.

Install or Upgrade DataMinder

If you have a previous version of DataMinder installed you should probably upgrade. If not you should do a new install.

In this document we refer to the /DataMinder root directory as {DM_ROOT}.

Install Java (if you haven't already)

In order to run DataMinder there must be a Java runtime installed for your platform. You need to install the latest Java 8 Runtime from: http://www.oracle.com/technetwork/java/javase/downloads/index.html

If available, install the Server runtime, which is optimised for server environments. You do not need the JDK (Java Development Kit); it contains tools that are not needed to run DataMinder.

After installing the Java runtime, verify that you have the correct version with:

java -version

The response should start with "1.8." and be similar to: java version "1.8.0_162"
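
If you want to automate this check, e.g. in an install script, the Python sketch below shows one way to do it. It assumes the version banner has the form shown above; note that `java -version` writes the banner to stderr, not stdout. The function names are illustrative only.

```python
import re
import subprocess


def java_version(banner):
    """Extract the quoted version string from the `java -version` banner."""
    match = re.search(r'version "([^"]+)"', banner)
    return match.group(1) if match else None


def is_java_8(banner):
    """True if the banner reports a 1.8.x runtime."""
    version = java_version(banner)
    return version is not None and version.startswith("1.8.")


def installed_java_banner():
    """Run `java -version`; the banner is written to stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr
```

With Java on the path, `is_java_8(installed_java_banner())` tells you whether the installed runtime is a Java 8 one.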

Install

NOTE : To install make sure that all ports are available e.g. no previous DataMinder or other program is using the ports. Ports under 1024 may require root or administrator access if they are to be used.
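
A quick way to see whether a port is already taken is to try to bind to it. The Python sketch below does that for the default ports mentioned in this document (8080 for the installer, 8443 for https and 9080 for web services). The port list and the function names are assumptions for illustration; use the ports of your own configuration.

```python
import socket

# Default ports mentioned in this document; adjust to your configuration.
DEFAULT_PORTS = (8080, 8443, 9080)


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Bind fails when the port is in use (or when it needs admin rights).
            return False
        return True


def busy_ports(ports=DEFAULT_PORTS):
    """Return the ports from the list that cannot be bound."""
    return [port for port in ports if not port_is_free(port)]
```

An empty list from `busy_ports()` means all the checked ports could be bound at that moment.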

NOTE Windows: When installing on Windows the total path may not be more than about 256 characters long. We therefore recommend installing DataMinder in a location whose path is no more than 50-70 characters long. More information here.

Download DataMinder from the DOWNLOAD page.

  1. Unpack the zip file named similar to DataMinder_v(...).zip.
    {DM_ROOT} is the root /DataMinder directory.
  2. Move the {DM_ROOT} directory to where you want DataMinder to be installed.
  3. Open a terminal window and go to the installation root directory, {DM_ROOT}, where the DataMinder.jar is.
  4. To start DataMinder run the command from {DM_ROOT} directory: java -jar DataMinder.jar
  5. Follow the instructions in the terminal window. The installation starts at the default address http://127.0.0.1:8080
    The port can be changed in the file if needed. The installation is only accessible from the local host, i.e. IP address 127.0.0.1.
  6. Follow the instructions in the browser to finish the installation.

Upgrade

NOTE : To upgrade make sure that all ports are available e.g. no previous DataMinder or other program is using the ports. Ports under 1024 may require root or administrator access if they are to be used.

If you want the previous configuration to be available in the new installation, you can copy the /Server/Config folder from the previous installation. It contains all the properties, runnables, external plugins etc.

The recommended flow for upgrade is:

  1. Shut down any previous DataMinder to avoid port collisions. If it is running, you can find its process number in the pid file {DM_ROOT}/pid.txt
  2. Unpack the zip file named like DataMinder_v(...).zip.
  3. Move the {DM_ROOT} directory to where you want DataMinder to be installed.
  4. Open a terminal window and go to the installation root directory, {DM_ROOT}, i.e. the unpacked /DataMinder directory where the DataMinder.jar is.
  5. Rename the {DM_ROOT}/Server/Config directory in the new installation to {DM_ROOT}/Server/Config_original or something similar.
  6. Copy the {DM_ROOT}/Server/Config directory from the previous installation to the new one.
  7. To start DataMinder run the command from {DM_ROOT} directory: java -jar DataMinder.jar
  8. Log in to the new DataMinder at the same https address (default is https://127.0.0.1:8443/DM) with the same password as before.
  9. Done!
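
Steps 5 and 6 are easy to script if you upgrade often. The Python sketch below renames the Config directory of the new installation to Config_original and then copies the Config directory of the previous installation into place. The function name and the two root arguments are illustrative; run it only while both installations are shut down.

```python
import shutil
from pathlib import Path


def carry_over_config(old_root, new_root):
    """Keep the new Config as Config_original and copy the old Config in."""
    old_config = Path(old_root) / "Server" / "Config"
    new_config = Path(new_root) / "Server" / "Config"
    backup = new_config.with_name("Config_original")

    # Check everything before touching the new installation.
    if not old_config.is_dir():
        raise FileNotFoundError(f"No previous configuration at {old_config}")
    if backup.exists():
        raise FileExistsError(f"{backup} already exists")

    new_config.rename(backup)                # step 5
    shutil.copytree(old_config, new_config)  # step 6
    return backup
```

The function returns the path of the backup so that you can restore the original configuration if something goes wrong.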

Install DataMinder Licence

To install new licences, go to the Monitor tab and open the licence management dialog. Either upload a licence file manually or add a licence download URL.

Java Cryptography Extension (JCE) Unlimited Strength [OPTIONAL]

To use strong encryption you may need to update Java. Strong encryption has not been included in Java by default because the US restricts which countries may download strong encryption components. Note that from Java 8u161 onwards the unlimited policy is enabled by default, so this step should only be needed on older Java 8 updates.

If you like, you can download the "Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files" here and follow the instructions given there.

Build

Create your own solutions

In this part we will go through some more extensive examples of how you can build and configure DataMinder.

To illustrate how to work with DataMinder we start with 2 simple problems that show key concepts of how DataMinder works and what you could do with it.

Disclaimer: The problems are intentionally very simple to illustrate the workflow and concepts without getting stuck in implementation details. Of course you could easily imagine real situations with similar (but more complex) problems.

Problem 1: Read a file and store the data in a database

Each day our partner company uploads a CSV (comma-separated values) file with customer user data to our ftp server. It contains the columns FirstName and LastName for customers that registered with our partner. Our job is to import all those names into the common customer database. We need to make sure all names have the right case, create an email address for each user and import the data into the database.

The file (or similar) may be found in the DataMinder Internal Plugin Folder for "DM Basic Plugins" library as {DM_ROOT}/Server/Internal/DMPlugins/DMPluginBasic/testdata/csv/users.csv and accessible from the plugin configuration as: {INSTALL_DIR_ROOT}/testdata/csv/users.csv

    FirstName  LastName
    ALICE      HILL
    bob        king
    ANNE       HaRT
    jOe        SimS
    Ieo        bASS
    Jane       RILEy
    LINDA      sUtton
    LaRRy      dAVIs
    MaRY       blake
    JacK       HOLT

A possible workflow may look like this:

  1. Create a process: "Import users"
  2. Go to plugins and add the following plugins to toolbox.
    • Input: "CSV input" read from csv file
    • Task: "First to upper" makes the data in the chosen columns start with an uppercase letter followed by lower case
    • Task: "Add values" creates an email address based on values in table columns
    • Task: "To lower case" makes the email address lower case
    • Common Object: "SQL connection" creates a reusable connection object to a database instance
    • Output: "SQL insert" creates insert statements based on table data and sends them to database.
    • Scheduler: "Interval scheduler" starts a process at specific intervals
    To find a plugin you can open the top config node or category under the Plugins tab. All plugins at and below that level are then shown. Then you can just do a search in your browser for e.g. "csv".
  3. Set up the database connection by putting "SQL connection" plugin in the same category as our process is in.
  4. Go back to the process, open the Build/Debug view and start building the process by Drag’n’Dropping the plugins into it.
  5. Configure "CSV input" plugin to read the CSV file, run the plugin with "Next ->" and verify the data was imported.
  6. Fix case on the 2 columns FirstName and LastName with plugin "First to upper" and run it with "Next ->" and verify data.
  7. Create emails with plugin "Add values" and store it in Email column.
  8. Make email address all lowercase with plugin "To lower case".
  9. Configure plugin "SQL insert" to use the database connection provided by plugin "SQL connection" and create INSERT statements with all columns and send them to database with "SQL insert".
  10. Run the process in Build/Debug view and verify it works.
  11. As the last thing we want to run the process every hour since the file may get uploaded any time during the day. To do that go to "Schedulers/Listeners" and Drag’n’Drop our scheduler "Interval scheduler" and configure it to start the process every hour.
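
To make the data flow in steps 6 to 9 concrete, here is a Python sketch of what the tasks do to each row. It is not DataMinder code. The email format FirstName.LastName@test.com and the table name Users are assumptions made for illustration, and an in-memory SQLite database stands in for the customer database.

```python
import sqlite3


def first_to_upper(value):
    """First letter upper case, the rest lower case, e.g. 'HaRT' -> 'Hart'."""
    return value.capitalize()


def make_email(first_name, last_name, domain="test.com"):
    """Build a lower-case email address (assumed format: first.last@domain)."""
    return f"{first_name}.{last_name}@{domain}".lower()


def prepare_users(rows):
    """Fix the case of both name columns and add an Email column."""
    users = []
    for first_name, last_name in rows:
        first_name = first_to_upper(first_name)
        last_name = first_to_upper(last_name)
        users.append((first_name, last_name, make_email(first_name, last_name)))
    return users


def insert_users(connection, users):
    """Send one parameterised INSERT per row to the database."""
    connection.executemany(
        "INSERT INTO Users (FirstName, LastName, Email) VALUES (?, ?, ?)", users
    )
    connection.commit()
```

In DataMinder each of these functions corresponds to a configured plugin, so no such code has to be written.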

To summarize: We just created a process that checks every hour whether a file exists. If the file exists, the process reads its contents, fixes the format, creates email addresses and saves the data in the database.

There are many more things we may need to do to have a production quality process e.g. remove file after read, verify user or email is not already stored etc. The point here is we created a non-trivial workflow in minutes without having to do any implementation!

Problem 2: Create simple Web Service API returning JSON (JavaScript Object Notation) data

We need to create a simple Web Service API that lets other systems get user data from database just by calling our service with an email address. When called with url like:

    https://your_server.com/users/getUser?user=bob@test.com
  

the web service would return some JSON data that the other system can parse, e.g. to show it on a web page or use it in some other way.

A possible workflow may look like this:

  1. Create process: "/users/getUser".
  2. Go to plugins and add the following plugins to toolbox:
    • Service: "HTTP web service" to be able to listen to HTTP/HTTPS calls.
    • Listener: "Http listener" to be able to run our process when the web service is called.
    • Common Object: "SQL connection" to set up a connection to our user database.
    • Input: "SQL select" to run a SQL select statement to get the user with a specific email.
  3. Add the web service "HTTP web service" to the same category our process is in and set it up to listen to incoming HTTP/HTTPS traffic on the web service ports. The ports were set up for DataMinder during installation and may be found in the Server/Config/DataMinder.properties file.
  4. Add the database connection plugin "SQL connection" in the same category as our process is in and set up the database connection.
  5. Go back to the process, open the Build dialog and start building the process by Drag’n’Dropping the plugins into it.
  6. Configure "SQL select" to send a SELECT statement to the database to get the user whose email address matches the one from the HTTP request.
  7. Finally connect and configure the listener "Http listener" by going to "Schedulers/Listeners" and Drag’n’Dropping the listener on the process.

When calling e.g. http://127.0.0.1:9080/users/getUser?email=alice.hill@test.com

The following would be returned:

{
"columns":["FirstName","LastName","Email"]
,
"data":[
["Alice","Hill","alice.hill@test.com"]
]
,
"results":[
{"index":"1","objectType":"Task","objectName":"SELECT FirstName, LastName, Email from UserDatabase.Users","objectId":"8F","result":{"status":"OK","info":""}},
{"index":"2","objectType":"Process","objectName":"Get user by email","objectId":"8B","result":{"status":"OK","info":""}}]}
  

To summarize: We set up a Web Service API in minutes that connects to a database, retrieves user data and sends it back as JSON to the requesting system, e.g. a web page.
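
The lookup behind this service can be sketched in Python as follows. This is not how DataMinder implements it. It only illustrates the idea of passing the email from the request as a query parameter and returning the columns/data part of the response. The table name Users and the function names are assumptions, and an in-memory SQLite database can stand in for the user database.

```python
import json
import sqlite3

COLUMNS = ["FirstName", "LastName", "Email"]


def get_user(connection, email):
    """Look up a user by email and return the columns/data part of the response."""
    cursor = connection.execute(
        # The email is bound as a parameter, never pasted into the SQL text.
        "SELECT FirstName, LastName, Email FROM Users WHERE Email = ?",
        (email,),
    )
    return {"columns": COLUMNS, "data": [list(row) for row in cursor.fetchall()]}


def user_response(connection, email):
    """Serialise the lookup result as a JSON string."""
    return json.dumps(get_user(connection, email))
```

Binding the value as a parameter matters here because it comes straight from an external HTTP request.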

If other formats or e.g. REST-style URLs are desired, a different service and different plugins may be used.

Runnables

Runnables is where things happen

Runnables is the area where things actually happen. By building processes using plugins like:

you can create complex actions and flows.

Group categories in other categories. Add common objects and services to be shared by processes. Add processes with tasks, inputs and outputs to work on data. Let schedulers start processes based on time or listeners react to service calls.

In case you want to implement your own plugins you can easily do that in minutes by following instructions in the Plugin Development section.

Build a process

Build a process by adding plugins that together provide the desired functionality, e.g. plugins that read from a database, manipulate the data and then write it to another database. Drag'n Drop plugins to arrange them in the desired order.

To create a process, a possible flow is:

  1. Create process in Runnables
  2. Go to Plugins and add plugins to toolbox.
  3. Go back to Runnables and the process and use the Build/Debug button to start building the process by Drag'n Dropping the plugins from the toolbox. The plugins can be moved back and forth between process and toolbox. They can be re-arranged, moved or copied.
  4. Configure the plugins as required. You can run the process step by step running one plugin at a time and step back and forward in the run plugins list to watch data change.

Debug view

In debug view you can build and run a process task by task. You can see how the data changes by each task run. You can step back and forth in plugin run history. The data is cached for each plugin run meaning that already run tasks do not have to be rerun. If you want to clear the cache just reset the process run to no tasks run.

Maximum number of rows in debug view

When building a process and debugging we need only enough rows in table to understand what happens. To avoid much too large data sets, e.g. because of a database query, DataMinder is configured to only use a maximum number of rows during debug. To change the number of rows used in debug view set the dataminder.environment.engine.maxNumberOfRowsInDebugView property in DataMinder.properties file. Default is 200.

Tips and tricks

If you have a process that gets its initial data from some other system, e.g. by a web service call, you may want to save the initial data and re-create it during debug. To create initial data you can:

  1. If the initial data is small and simple you can just create it with "Add values" task e.g. if the data is a simple id=5678.
  2. For more complex data, e.g. when an entire form is posted, you can create a special process whose only purpose is to export a table with the posted initial data. See the Export/Import table tasks.
    Now you can start to build your process and as the first task you import the previously exported table (with all data from e.g. web service call). In this way you can build your process step by step without having to deal with the complexity of external web service calls.

Run a Process

When the plugins are configured correctly and arranged in the desired order you may start a process in one of the following ways:

Manually

  1. Press Run button and run the process once. This use-case is common if you e.g. want to run a job just once or not regularly.
  2. Debug and build the process by pressing the Build button and then run/build the process step by step in debug view.

Automatically

  1. By connecting a scheduler to the process. Get a scheduler in plugins or build your own and add it to the process. Configure the scheduler to start the process.
  2. With a listener and service. Get a service from Plugins and add it to a category or use an existing service. Add the listener to the process and connect it to the service. Now the service will notify the listener when some external event happens and starts the process. A service may provide the process with data e.g. from a web service call.

Plugins

The building blocks of DataMinder

To find a plugin you can open the top config node or category under the Plugins tab. All plugins at and below that level are then shown. Then you can just do a search in your browser for e.g. "csv".

Collect the plugins you need by adding them to the Toolbox. Later you can Drag'n Drop them to processes and categories (depending on what plugin type it is) in Runnables.

Monitor

Status of processes run

TODO

Plugin Development: Build your own plugin libraries in 3 steps

Create your new plugin library within minutes

DataMinder was made with extensibility in mind. We don't know all the things you may want to do with it. We tried hard to make plugin development as easy as possible.

Choose the plugin type

You can implement any of the following plugin types depending on what you need to do. Each plugin type solves a different problem.

To create a new Plugin Library create a standard Java JAR file with the implementation of your plugins.

We provide an example implementation of all plugin types to get started.

You can use it as your first implementation.

Download it at our DOWNLOAD page.

Build your own plugin libraries in 3 steps

Create a new DataMinder plugin library

To start working with your plugins you need to add our public api jar-file to the project path:

{installation folder}/DataMinder/Server/PluginDevelopment/DMPublicAPI.jar

Now you can create a standard Java JAR (Java Archive) file which contains the following:

  1. Create Manifest file : META-INF/MANIFEST.MF
    It states the name of Plugin Library and points to the Plugin Library Information Class
    Manifest-Version: 1.0
    DATAMINDER-PLUGIN-LIBRARY: com.test.MyPluginLibrary
    DATAMINDER-PLUGIN-LIBRARY-NAME: My Plugins
        
  2. Create One Single Plugin Library Information Class returning a list of all available Plugins in this Library :
    Implement the DMPluginLibrary
    com.test.MyPluginLibrary
        
  3. Implement Plugins : implementing any of the plugin interfaces listed here
  4. Install your plugin library as a folder containing the jar file /Acme_plugins.jar. Install it at:

    {installation folder}/DataMinder/Server/Config/Plugins/Acme_plugins/Acme_plugins.jar
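
Before installing a library it can be handy to check that the JAR really contains the manifest entries from step 1. The Python sketch below reads META-INF/MANIFEST.MF from a JAR and reports which of the two DataMinder entries are missing. It assumes that both entries are required, which is our reading of step 1, and the function names are illustrative only.

```python
import zipfile

# The two DataMinder entries from the manifest example in step 1.
EXPECTED_HEADERS = ("DATAMINDER-PLUGIN-LIBRARY", "DATAMINDER-PLUGIN-LIBRARY-NAME")


def read_manifest(jar_path):
    """Return the main manifest headers of a JAR as a dict."""
    with zipfile.ZipFile(jar_path) as jar:
        text = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    headers = {}
    last_key = None
    for line in text.splitlines():
        if line.startswith(" ") and last_key:
            # A leading space continues the previous header value.
            headers[last_key] += line[1:]
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            headers[last_key] = value.strip()
    return headers


def missing_headers(jar_path):
    """Return the expected DataMinder headers that are absent or empty."""
    headers = read_manifest(jar_path)
    return [name for name in EXPECTED_HEADERS if not headers.get(name)]
```

An empty list from `missing_headers(...)` means both entries were found in the manifest.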

Sample implementation

We have provided a sample implementation of all plugin types. Please download the Sample Plugins implementation to get started.

DataMinder Plugin JavaDoc

To start developing DataMinder plugins, begin with the DataMinder Public Plugin API JavaDoc.

Generate Plugin Code

To make plugin creation easier, we created online Plugin Code Generation pages that quickly generate plugin code, a MANIFEST.MF file and a DMPluginLibrary object to start developing from.

Open Plugin Code Generator in new window
