What is DataMinder?
DataMinder is a framework for building, running and monitoring processes that in some way modify data.
How does DataMinder work?
DataMinder stores data in a data table. The table is sent to each plugin in the current process, and each plugin can work on the data (e.g. clean it up), add new data (e.g. from a database, file or web service call) or write data somewhere (e.g. to a database, file or mail).
How do I get DataMinder?
I don't read manuals. Get me to a minimal "Get started" further down the page
DataMinder key concepts
With DataMinder you can:
- Build processes by combining plugins from any number of plugin libraries
- Monitor processes while they run and afterwards
- Develop new plugins and plugin libraries to suit your purposes
When building processes, you combine DataMinder plugins which are small, configurable building blocks that
implement some specific function. The following plugin types exist today:
- Tasks manipulate data: send mail, set data to lower case, encrypt data etc.
- Inputs get data: read a file, retrieve data from database etc.
- Outputs output data: write data to file, write data to database etc.
- Decisions change the flow: include another process, redirect flow to another process, check authentication etc.
- Services listen to outside events: http/https web service calls, file modifications etc.
- Listeners start a process when a service receives events: web service call, file is uploaded etc.
- Schedulers start a process based on time: every hour, every Sunday at 04:00 etc.
- Common Objects are shared resources: database connections etc.
By combining plugins into processes, DataMinder can solve many common problems facing programmers and IT administrators, e.g.:
- Setting up a Web Service API to backend systems, e.g. to be called by a web layer that presents data to end users, or to be called in a standardized way by other systems
- Automating everyday tasks like running batch jobs
- Cleaning up and syncing data, e.g. making sure 2 data sources contain the same user data such as phone numbers
All plugins are run and managed by DataMinder so you can focus only on the problem instead of spending energy on
just getting the surrounding systems to work.
Whether you combine existing plugins into processes or build new plugins, only your ingenuity and imagination set
the limit to what you can do with DataMinder.
Start building processes and/or develop your own plugin library
With DataMinder you can quickly build automated data flows by combining plugins into processes.
Processes can read/write data, modify data etc.
Key concepts:
- Process: Does something with data using 1 or more plugins
- Plugins: Reusable building blocks that make up processes
- Table: Data is imported to DataMinder as a table with rows and columns. Each plugin in the process
gets access to the table and can add/remove/modify data in it.
It can import/export data to/from external sources.
- Runnables: Where you create, manage and group processes into categories for easy management
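As a rough mental model only, the way a table travels through the plugins of a process can be sketched in Python. This is not DataMinder's API: the `Table` class, the three plugin functions and `run_process` are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Table:
    """Hypothetical stand-in for the DataMinder data table."""
    columns: List[str] = field(default_factory=list)
    rows: List[List[str]] = field(default_factory=list)


def read_names(table: Table) -> Table:
    # Input-style plugin: adds new data to the table.
    table.columns = ["Name"]
    table.rows = [["  ALICE "], ["bob"]]
    return table


def clean_names(table: Table) -> Table:
    # Task-style plugin: cleans up the data already in the table.
    table.rows = [[value.strip().lower() for value in row] for row in table.rows]
    return table


def write_names(table: Table) -> Table:
    # Output-style plugin: writes the data somewhere (here: standard output).
    for row in table.rows:
        print(",".join(row))
    return table


def run_process(plugins: List[Callable[[Table], Table]]) -> Table:
    # The same table is handed to each plugin in order.
    table = Table()
    for plugin in plugins:
        table = plugin(table)
    return table


result = run_process([read_names, clean_names, write_names])
```

In DataMinder itself you get this behaviour by arranging plugins in a process; no code is needed.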
The basic workflow:
- Decide what to do: Create a process for what you intend to do, e.g. a web service or a database sync
- Collect the plugins: In the plugin library, get all the plugins you need to implement the process
- Drag’n’Drop plugins to build the process: In the build window, assemble your process while debugging it step by step
What next?
With DataMinder you:
- Build: You build your processes using plugins in the plugin library.
- Develop: You create plugins to do specific tasks.
- Monitor: You monitor the status of all run processes and of DataMinder.
Minimal "Get started"
For people in a hurry
Give it less than 5 minutes and you will understand how to work with DataMinder.
The quickest way to start working with DataMinder is to build a very simple process with one task, e.g. creating
uids (unique identifiers), and test run it. Make it accessible through the web service interface, where a parameter
specifies the number of uids to create.
Create a simple process with one task to generate uids and return them by web service call
- Go to the Plugins tab and add the following 3 plugins to the toolbox:
- DM Basic Plugins > Uid and counter > "Generate Uid"
- DM Basic Plugins > Services and Listeners > Http/https > "HTTP Web Service"
- DM Basic Plugins > Services and Listeners > Http/https > "HTTP Listener"
- Go to the Runnables tab. Create a category named: "My processes".
- Open the category and Drag’n’Drop the service "HTTP Web Service" from the toolbox to the category, then double click the service plugin to configure it. Now you have activated an HTTP/HTTPS web service.
- Allow only localhost = false
- Allow http from localhost = true
- Allow http from non-local host = true
- Name of secret key = (leave empty)
- Secret key value = (leave empty)
- Create a process under category "My processes" named: "Get uids" (or whatever name you prefer).
- Go to the process and open the Scheduler/Listener dialog to Drag’n’Drop the listener "HTTP Listener", then double click it to configure it.
- Listens to service = HTTP Web Service (or name of the HTTP service above)
- Path to process = /getUids
- Go to the process and open the Build/Debug dialog to Drag’n’Drop the task "Generate Uid", then double click it to configure it.
- UID length = 20 (or whatever length you like)
- Column to store UIDs in = UID (or whatever column name you like)
- Number of values to create = mark the checkbox "Value from table" to indicate the value
will come from the web service call as a parameter.
- Call your process at the web service interface (default web service port 9080) to create 10 uids:
http://127.0.0.1:9080/getUids?Number of values to create=10
And check that the process returns JSON data:
{
  "columns":["Number of values to create","UID"],
  "data":[
    ["10","zwymETliyE2tsyJZcDDI"],
    ["","X1tFU7PslEByoczSwBSg"],
    ["","LKWwqESp6cVR0UxqCdSn"],
    ["","cXT9vTQHUi7KDU7Nx5xm"],
    ["","XU99g0CngBXVj7UBphXX"],
    ["","fJSygZQAujlTIrNpESlT"],
    ["","nYj8FDXSXH3Gm0rJkqla"],
    ["","R6AQQX75kG6bJcTjGIX4"],
    ["","c4qFMypOpZEdwIqAVMSq"],
    ["","imr3TpTLUdPeQ7g9tzF5"]
  ],
  "results":[
    {"index":"1","objectType":"Task","objectName":"Generate Uid","objectId":"9R","result":{"status":"OK","info":""}},
    {"index":"2","objectType":"Process","objectName":"Get uids as web service","objectId":"9T","result":{"status":"OK","info":""}}
  ]
}
If you need to call the process with another parameter name, e.g.
http://127.0.0.1:9080/getUids?count=10, you
can add the rename columns task first
to rename the column from "count" to "Number of values to create" before calling the "Generate Uid" task.
Please see: DM Basic Plugins > Table > "Rename Columns"
If you want to remove the input parameter "Number of values to create" from the response, you can
add the remove column task last in the process flow.
Please see: DM Basic Plugins > Table > "Remove table columns"
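If you prefer to call the process from a script instead of a browser, the call above can be sketched in Python as follows. This is only a sketch under assumptions: the helper names (`build_url`, `extract_column`, `fetch_uids`) are made up for this example, and it assumes the web service accepts a percent-encoded parameter name. The response layout is taken from the JSON shown above.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Default web service port and process path from the steps above.
BASE_URL = "http://127.0.0.1:9080/getUids"


def build_url(count: int) -> str:
    # The parameter name contains spaces, so it is percent-encoded here.
    query = urlencode({"Number of values to create": count}, quote_via=quote)
    return f"{BASE_URL}?{query}"


def extract_column(response: dict, name: str) -> list:
    # "columns" gives the column order; "data" holds one list per row.
    index = response["columns"].index(name)
    return [row[index] for row in response["data"]]


def fetch_uids(count: int) -> list:
    # Performs the real call; needs a running DataMinder with the process above.
    with urlopen(build_url(count)) as reply:
        return extract_column(json.load(reply), "UID")


# Offline check against a shortened copy of the response shown above.
sample = json.loads("""
{"columns": ["Number of values to create", "UID"],
 "data": [["10", "zwymETliyE2tsyJZcDDI"], ["", "X1tFU7PslEByoczSwBSg"]]}
""")
url = build_url(10)
uids = extract_column(sample, "UID")
```

Picking only the "UID" column on the client side is an alternative to removing the input parameter column inside the process.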
In case you skipped the key concepts, we recommend having a look at DataMinder key concepts.
Install or Upgrade DataMinder
If you have a previous version of DataMinder installed you should probably upgrade.
If not you should do a new install.
In this document we refer to the /DataMinder root directory as {DM_ROOT}.
Install Java (if you haven't already)
In order to run DataMinder there must be a Java runtime installed for your platform.
You need to install the latest Java 8 Runtime from:
http://www.oracle.com/technetwork/java/javase/downloads/index.html
If available install the Server runtime which is optimised for server environments.
You do not need the Java JDK ("Java Development Kit"); it contains tools that are not needed to run DataMinder.
After installing the Java runtime, verify that you have the correct version with:
java -version
The response should start with "1.8." and be similar to:
java version "1.8.0_162"
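If you want to script this check, e.g. before an unattended install, a minimal sketch in Python could look like the following. The function names are made up for this example and are not part of DataMinder. Note that `java -version` prints to standard error, not standard output.

```python
import re
import subprocess


def is_java_8(version_output: str) -> bool:
    # Looks for the quoted version string, e.g. java version "1.8.0_162".
    match = re.search(r'version "([^"]+)"', version_output)
    return bool(match) and match.group(1).startswith("1.8.")


def installed_java_output() -> str:
    # Runs the real command; requires java to be on the PATH.
    done = subprocess.run(
        ["java", "-version"],
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return done.stderr
```

Calling `is_java_8(installed_java_output())` then tells you whether the installed runtime reports a "1.8." version.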
Install
NOTE: Before installing, make sure that all ports are available,
i.e. that no previous DataMinder
or other program is using them. Ports under 1024 may require root or administrator access.
NOTE Windows: When installing on Windows the
total path may not be longer
than approximately 256 characters.
Therefore we recommend installing DataMinder in a location whose path is no longer than
50-70 characters. More information here.
- Download DataMinder (DOWNLOAD).
- Unpack the zip file, named similar to DataMinder_v(...).zip.
{DM_ROOT} is the root /DataMinder directory.
- Move the {DM_ROOT} directory to where you want DataMinder to be installed.
- Open a terminal window and go to the installation root directory, {DM_ROOT}, where
DataMinder.jar is.
- To start DataMinder run the command from {DM_ROOT} directory: java -jar DataMinder.jar
- Follow the instructions in the terminal window. The installation will start at the default port
http://127.0.0.1:8080
The port can be changed in the file if needed.
The installation will only be accessible from localhost, i.e. IP address 127.0.0.1.
- Follow the instructions in the browser to finish the installation.
Upgrade
NOTE: Before upgrading, make sure that all ports are available,
i.e. that no previous DataMinder
or other program is using them. Ports under 1024 may require root or administrator access.
If you want the previous configuration to be available
in the new installation, you can copy the /Server/Config folder. It contains all
the properties, runnables, external plugins etc.
The recommended flow for upgrade is:
- Shut down any previous DataMinder to avoid port collisions. You can find the process number
(if it is running) in the pid file {DM_ROOT}/pid.txt
- Unpack the zip file, named like DataMinder_v(...).zip.
- Move the {DM_ROOT} directory to where you want DataMinder to be installed.
- Open a terminal window and go to the installation root directory, {DM_ROOT}, where you unpacked the contents of the zip file and where DataMinder.jar is.
- Rename the {DM_ROOT}/Server/Config directory in the new installation to {DM_ROOT}/Server/Config_original or something similar.
- Copy the {DM_ROOT}/Server/Config directory from the previous installation to the new one.
- To start DataMinder run the command from {DM_ROOT} directory: java -jar DataMinder.jar
- Log in to the new DataMinder at the same https address
(default is https://127.0.0.1:8443/DM) with the same password as before.
- Done!
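The rename and copy steps of the upgrade can also be scripted. The following Python sketch is an illustration only: `migrate_config` is a made-up helper, not a DataMinder tool, and the demonstration runs on throwaway directories so no real installation is touched.

```python
import shutil
import tempfile
from pathlib import Path


def migrate_config(old_root: Path, new_root: Path) -> Path:
    """Keep the new default config as a backup and copy the old config in."""
    new_config = new_root / "Server" / "Config"
    backup = new_root / "Server" / "Config_original"
    # Rename step: keep the freshly unpacked config under another name.
    new_config.rename(backup)
    # Copy step: bring over the config of the previous installation.
    shutil.copytree(old_root / "Server" / "Config", new_config)
    return backup


# Demonstration on temporary directories standing in for two {DM_ROOT}s.
with tempfile.TemporaryDirectory() as tmp:
    old_root, new_root = Path(tmp) / "old", Path(tmp) / "new"
    for root, marker in ((old_root, "previous"), (new_root, "default")):
        (root / "Server" / "Config").mkdir(parents=True)
        (root / "Server" / "Config" / "DataMinder.properties").write_text(marker)
    backup = migrate_config(old_root, new_root)
    migrated = (new_root / "Server" / "Config" / "DataMinder.properties").read_text()
    kept = (backup / "DataMinder.properties").read_text()
```

Run something like this only after the previous DataMinder has been shut down, as described in the first step.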
Install DataMinder Licence
To install new licences, go to the Monitor tab and open the licence management
dialog. Either upload a licence
file manually or add a licence download URL.
Java Cryptography Extension (JCE) Unlimited Strength [OPTIONAL]
To use strong encryption you need to update Java. Strong encryption is not included in Java by
default because the US restricts which countries may download strong encryption components.
If you like, you can download the "Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files"
here
and follow the instructions given there.
Build
Create your own solutions
In this part we will go through some more extensive examples of how you can build and configure DataMinder.
To illustrate how to work with DataMinder we start with 2 simple problems that show key concepts of how DataMinder works
and what you could do with it.
Disclaimer: The problems are intentionally very simple to illustrate the
workflow and concepts without getting stuck in implementation details.
Of course you could easily imagine real situations with similar (but more complex) problems.
Problem 1: Read a file and store the data in a database
Each day a CSV (comma-separated values) file with customer user data is uploaded from our partner company to our ftp server.
It contains the columns FirstName and LastName of customers that registered with our partner. Our job is to import all those names into the
common customer database.
We need to make sure all names are in the right case, create an email address for each user and import the data into the database.
The file (or a similar one) can be found in the DataMinder Internal Plugin Folder for the "DM Basic Plugins" library as
{DM_ROOT}/Server/Internal/DMPlugins/DMPluginBasic/testdata/csv/users.csv and is accessible
from the plugin configuration as: {INSTALL_DIR_ROOT}/testdata/csv/users.csv
FirstName | LastName |
ALICE | HILL |
bob | king |
ANNE | HaRT |
jOe | SimS |
Ieo | bASS |
Jane | RILEy |
LINDA | sUtton |
LaRRy | dAVIs |
MaRY | blake |
JacK | HOLT |
A possible workflow may look like this:
- Create a process: "Import users"
- Go to plugins and add the following plugins to toolbox.
- Input: "CSV input" reads from a csv file
- Task: "First to upper" makes data in columns start with an uppercase letter followed by lower case
- Task: "Add values" creates an email based on values in table columns
- Task: "To lower case" makes the email address lower case
- Common Object: "SQL connection" creates a reusable connection object to a database instance
- Output: "SQL insert" creates insert statements based on table data and sends them to the database
- Scheduler: "Interval scheduler" starts a process at specific intervals
To find a plugin you can open the top config node or category under the Plugins tab.
All plugins at and below that level are then shown. Then you can just do a search in your browser for e.g. "csv".
- Set up the database connection by putting the "SQL connection" plugin in the same category as our process.
- Go back to the process, open the Build/Debug view and start building the process by Drag’n’Drop of the plugins into the process.
- Configure the "CSV input" plugin to read the CSV file, run the plugin with "Next ->" and verify the data was imported.
- Fix the case of the 2 columns FirstName and LastName with the plugin "First to upper", run it with "Next ->" and verify the data.
- Create emails with the plugin "Add values" and store them in the Email column.
- Make the email addresses all lowercase with the plugin "To lower case".
- Configure the plugin "SQL insert" to use the database connection provided by the plugin "SQL connection", create INSERT statements with all columns and send them to the database.
- Run the process in the Build/Debug view and verify it works.
- As the last step, we want to run the process every hour since the file may get uploaded at any time during the day. To do that, go to "Schedulers/Listeners", Drag’n’Drop our scheduler "Interval scheduler" and configure it to start the process every hour.
To summarize: We just created a process that checks every hour whether a file exists. If the file exists, it reads the contents and
saves them in the database after fixing the format and creating email addresses.
There are many more things we may need to do to have a production quality process, e.g. remove the file after reading it,
verify that the user or email is not already stored etc. The point here is that we created a non-trivial workflow in minutes without
having to do any implementation!
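To make the data changes in this workflow concrete, here is a small Python sketch of what the plugins do to the table. It is not how DataMinder implements them. The function names are made up, an in-memory SQLite database stands in for the customer database, and the email pattern firstname.lastname@test.com is an assumption based on the sample address used in Problem 2.

```python
import csv
import io
import sqlite3

# First rows of the sample file, assumed to be comma-separated.
CSV_TEXT = """FirstName,LastName
ALICE,HILL
bob,king
ANNE,HaRT
"""


def first_to_upper(value: str) -> str:
    # Like "First to upper": first letter upper case, the rest lower case.
    return value[:1].upper() + value[1:].lower()


def import_users(csv_text: str, domain: str = "test.com") -> list:
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):   # "CSV input"
        first = first_to_upper(record["FirstName"])
        last = first_to_upper(record["LastName"])
        email = f"{first}.{last}@{domain}".lower()   # "Add values" + "To lower case"
        rows.append({"FirstName": first, "LastName": last, "Email": email})
    return rows


users = import_users(CSV_TEXT)

# "SQL connection" + "SQL insert", using a throwaway in-memory database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Users (FirstName TEXT, LastName TEXT, Email TEXT)")
connection.executemany(
    "INSERT INTO Users VALUES (:FirstName, :LastName, :Email)", users
)
stored = connection.execute(
    "SELECT FirstName, LastName, Email FROM Users ORDER BY rowid"
).fetchall()
```

Each commented step corresponds to one plugin in the process, which is why the process can be verified one "Next ->" at a time in the Build/Debug view.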
Problem 2: Create simple Web Service API returning JSON (JavaScript Object Notation) data
We need to create a simple Web Service API that lets other systems get user data from the database just by calling our service with an email address.
When called with a url like:
https://your_server.com/users/getUser?user=bob@test.com
the web service returns JSON data that the other system can parse, e.g. to show it on a web page or use it in some other way.
A possible workflow may look like this:
- Create process: "/users/getUser".
- Go to plugins and add the following plugins to toolbox:
- Service: "HTTP web service" to be able to listen to HTTP/HTTPS calls.
- Listener: "Http listener" to be able to run our process when the web service is called.
- Common Object: "SQL connection" to set up a connection to our user database.
- Input: "SQL select" to run a SQL select statement to get the user with a specific email.
- Add the web service "HTTP web service" to the same category our process is in and set it up to listen to incoming HTTP/HTTPS traffic on the web service ports. The ports were set up for DataMinder during installation and can be found in the Server/Config/DataMinder.properties file.
- Add the database connection plugin "SQL connection" to the same category as our process and set up the database connection.
- Go back to the process, open the Build dialog and start building the process by Drag’n’Drop of the plugins into the process.
- Configure "SQL select" to send a SELECT statement to the database to get the user whose email address matches the one from the HTTP request.
- Finally connect and configure the listener "Http listener" by going to "Schedulers/Listeners" and Drag’n’Drop the listener onto the process.
When calling e.g. http://127.0.0.1:9080/users/getUser?email=alice.hill@test.com
the following would be returned:
{
  "columns":["FirstName","LastName","Email"],
  "data":[
    ["Alice","Hill","alice.hill@test.com"]
  ],
  "results":[
    {"index":"1","objectType":"Task","objectName":"SELECT FirstName, LastName, Email from UserDatabase.Users","objectId":"8F","result":{"status":"OK","info":""}},
    {"index":"2","objectType":"Process","objectName":"Get user by email","objectId":"8B","result":{"status":"OK","info":""}}
  ]
}
To summarize: We set up a Web Service API in minutes that connects to a database to retrieve user data and
sends it back as JSON to the requesting service, e.g. a web page.
If other formats or e.g. REST-style urls are desired, different services and plugins may be used.
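For readers who think in code, the lookup this process performs can be sketched in Python as below. This is an illustration under assumptions, not DataMinder's implementation: `get_user` is a made-up function, an in-memory SQLite table named Users stands in for the user database, and only the "columns" and "data" parts of the response are reproduced (the "results" part is left out).

```python
import json
import sqlite3


def get_user(connection: sqlite3.Connection, email: str) -> str:
    # The email from the request is passed as a bound parameter,
    # never pasted into the SQL text.
    cursor = connection.execute(
        "SELECT FirstName, LastName, Email FROM Users WHERE Email = ?",
        (email,),
    )
    columns = [description[0] for description in cursor.description]
    data = [list(row) for row in cursor.fetchall()]
    return json.dumps({"columns": columns, "data": data})


# Throwaway database with one user, just for the demonstration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Users (FirstName TEXT, LastName TEXT, Email TEXT)")
connection.execute(
    "INSERT INTO Users VALUES (?, ?, ?)", ("Alice", "Hill", "alice.hill@test.com")
)

body = get_user(connection, "alice.hill@test.com")
missing = get_user(connection, "nobody@test.com")
```

An unknown email address gives an empty "data" list in this sketch; how DataMinder reports that case is not covered here.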
Runnables
Runnables is where things happen
Runnables is the area where things actually happen. By building processes using plugins like:
- Tasks manipulate data: send mail, set data to lower case, encrypt data etc.
- Inputs get data: read a file, retrieve data from database etc.
- Outputs output data: write data to file, write data to database etc.
- Decisions change the flow: include another process, redirect flow to another process, check authentication etc.
- Services listen to outside events: http/https web service calls, file modifications etc.
- Listeners start a process when a service receives events: web service call, file is uploaded etc.
- Schedulers start a process based on time: every hour, every Sunday at 04:00 etc.
- Common Objects are shared resources: database connections etc.
you can create complex actions and flows.
Group categories in other categories. Add common objects and services to be shared by processes. Add processes with tasks, inputs and outputs to work on data. Let schedulers start processes based on time or listeners react to service calls.
If you want to implement your own plugins, you can do that in minutes by following the instructions in
the Plugin Development section.
Build a process
Build a process by adding plugins to build the desired functionality, e.g. plugins that read from a database, manipulate the data and then write to another database. Drag'n Drop plugins to arrange them in the desired order.
To create a process, a possible flow is:
- Create a process in Runnables.
- Go to Plugins and add plugins to the toolbox.
- Go back to Runnables and the process and use the Build/Debug button to start building the process by Drag'n Drop of the plugins from the toolbox. The plugins can be moved back and forth between process and toolbox. They can be re-arranged, moved or copied.
- Configure the plugins as required. You can run the process step by step, one plugin at a time, and step back and forward in the list of run plugins to watch the data change.
Debug view
In debug view you can build and run a process task by task and see how the data changes with each task run.
You can step back and forth in the plugin run history. The data is cached for each plugin run, meaning that tasks that have already run do not have to be rerun.
If you want to clear the cache, just reset the process run to no tasks run.
Maximum number of rows in debug view
When building and debugging a process we only need enough rows in the table to understand what happens. To avoid
overly large data sets, e.g. from a database query, DataMinder is configured to use only a maximum number of rows during debug.
To change the number of rows used in debug view, set the
dataminder.environment.engine.maxNumberOfRowsInDebugView property in the DataMinder.properties file. Default is
200.
Tips and tricks
If you have a process that gets its initial data from some other system, e.g. by a web service call,
you may want to save the initial data and re-create it during debug. To create initial data you can:
- If the initial data is small and simple, just create it with the "Add values" task, e.g. if the data
is a simple id=5678.
- For more complex data, e.g. when an entire form is posted, create a special process whose only purpose
is to export a table with the posted initial data. See the Export/Import table tasks.
Now you can start to build your process and as the first task you import the previously exported table (with all data from e.g. web service call).
In this way you can build your process step by step without having to deal with the complexity of external web service calls.
Run a Process
When the plugins are configured correctly and arranged in the desired order, you can start a process in one of the following ways:
Manually
- Press the Run button and run the process once. This use case is common if you e.g. want to run a job just once or not regularly.
- Debug and build the process by pressing the Build button, then run/build the process step by step in debug view.
Automatically
- By connecting a scheduler to the process. Get a scheduler in Plugins or build your own and add it to the process. Configure the scheduler to start the process.
- With a listener and service. Get a service from Plugins and add it to a category or use an existing service. Add the listener to the process and connect it to the service. Now the service will notify the listener when some external event happens, and the listener starts the process. A service may provide the process with data, e.g. from a web service call.
Plugins
The building blocks of DataMinder
To find a plugin you can open the top config node or category under the Plugins tab.
All plugins at and below that level are then shown. Then you can just do a search in your browser for e.g. "csv".
Collect the plugins you need by adding them to the Toolbox. Later you can
Drag'n Drop them to processes and categories (depending on the plugin type) in Runnables.
Monitor
Plugin Development: Build your own plugin libraries in 3 steps
Create your new plugin library within minutes
DataMinder was made with extensibility in mind. We don't know all the things you may want to do
with it. We tried hard to make plugin development as easy as possible.
Choose the plugin type
You can implement any of the following plugin types depending on what you need to do. Each plugin type solves a different problem.
- Tasks manipulate data: send mail, set data to lower case, encrypt data etc.
- Inputs get data: read a file, retrieve data from database etc.
- Outputs output data: write data to file, write data to database etc.
- Decisions change the flow: include another process, redirect flow to another process, check authentication etc.
- Services listen to outside events: http/https web service calls, file modifications etc.
- Listeners start a process when a service receives events: web service call, file is uploaded etc.
- Schedulers start a process based on time: every hour, every Sunday at 04:00 etc.
- Common Objects are shared resources: database connections etc.
To create a new Plugin Library, create a standard Java JAR file with the implementation of your plugins.
We provide an example implementation of all plugin types to get started. You can use it as your first implementation.
Download it at our DOWNLOAD page.
Build your own plugin libraries in 3 steps
Create a new DataMinder plugin library
To start working with your plugins you need to add our public API jar file to the project path:
{installation folder}/DataMinder/Server/PluginDevelopment/DMPublicAPI.jar
Now you can create a standard Java JAR (Java Archive) file which contains the following:
- Create the manifest file: META-INF/MANIFEST.MF
It states the name of the Plugin Library and points to the Plugin Library Information Class:
Manifest-Version: 1.0
DATAMINDER-PLUGIN-LIBRARY: com.test.MyPluginLibrary
DATAMINDER-PLUGIN-LIBRARY-NAME: My Plugins
- Create one single Plugin Library Information Class that returns a list of all available plugins in this library: implement DMPluginLibrary, e.g. as com.test.MyPluginLibrary
- Implement plugins: implement any of the plugin interfaces listed here
- Install your plugin library as a folder containing the jar file, e.g. /Acme_plugins.jar. Install it at:
{installation folder}/DataMinder/Server/Config/Plugins/Acme_plugins/Acme_plugins.jar
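Before installing a library it can help to confirm that the JAR really contains the manifest entries described above. The following Python sketch shows one possible pre-install check. It is not part of DataMinder, the function names are made up, and it ignores manifest continuation lines (long values wrapped onto a following line that starts with a space).

```python
import io
import zipfile

REQUIRED_KEYS = ("DATAMINDER-PLUGIN-LIBRARY", "DATAMINDER-PLUGIN-LIBRARY-NAME")


def read_manifest(jar_bytes: bytes) -> dict:
    # A JAR is a zip archive; the manifest is a plain text entry inside it.
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        text = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    entries = {}
    for line in text.splitlines():
        if ": " in line:
            key, value = line.split(": ", 1)
            entries[key] = value
    return entries


def missing_keys(manifest: dict) -> list:
    return [key for key in REQUIRED_KEYS if key not in manifest]


# Build a tiny in-memory JAR holding only the example manifest from above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as jar:
    jar.writestr(
        "META-INF/MANIFEST.MF",
        "Manifest-Version: 1.0\n"
        "DATAMINDER-PLUGIN-LIBRARY: com.test.MyPluginLibrary\n"
        "DATAMINDER-PLUGIN-LIBRARY-NAME: My Plugins\n",
    )
manifest = read_manifest(buffer.getvalue())
```

For a real library you would read the bytes of your own jar file instead of the in-memory example and check that `missing_keys(manifest)` is empty.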
Sample implementation
We have provided a sample implementation of all plugin types.
Please download the Sample Plugins implementation to get started.
DataMinder Plugin JavaDoc
To start developing DataMinder plugins, begin with the DataMinder Public Plugin API JavaDoc.
Generate Plugin Code
To make plugin creation easier we created online Plugin Code Generation pages to quickly
generate plugin code + MANIFEST.MF file + DMPluginLibrary object to start developing.
Open Plugin Code Generator in new window