Process mining describes the systematic analysis, validation, modelling and visualisation of processes using stored log entries. It was originally developed to make processes more efficient: processing times were to be reduced, and bottlenecks and dead ends (process ends that were neither intended nor defined) were to be identified. Process mining uses statistical and mathematical methods.
We identify three application areas of process mining: analysis, validation and visualisation.
An exemplary process
We use BPMN, a licence-free and standardised notation language, to model processes. This makes it possible to design processes across systems and to identify bottlenecks as well as breaking points. Because this is done a priori, the bottlenecks and breaking points are only hypothesised. Process mining helps to analyse, validate, model and visualise processes. Below, we illustrate the different aspects of process mining using a fictitious process.
The fictitious and simplified process above, the development of a sales opportunity in the sales funnel, serves as our running example. An opportunity is created and developed in step 1 with lead nurturing. This is a so-called collapsed task, i.e. it contains a sub-process, which could consist of several automated emails or sales letters. Once the lead with its opportunity has been nurtured, the status of the opportunity is set to “Qualifying” (2). A follow-up call (3) then takes place. If there is no more interest, the process moves to step 8 and the opportunity is closed with the status “Closed Lost”. If, on the other hand, there is still interest, the status is set to “Needs Analysis” (4). After a certain time has elapsed (t = 14 days), another follow-up call takes place (5). This is followed by another gateway. If there is no more interest, step 8 follows. Otherwise, an offer is made and sent. The offer can either be accepted (step 7) or rejected (step 8).
Feedback loops or negotiation rounds and further complexity have been omitted for illustration purposes.
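To make the example easier to work with, the process described above can be written down as a simple transition table. The following is only a minimal sketch: the step labels are our own shorthand, and both the name "Offer Sent" with the number 6 and the label "Closed Won" for step 7 are assumptions, since the text only numbers steps 1 to 5, 7 and 8.

```python
from datetime import timedelta

# Modelled transitions of the example process (step numbers taken from the text).
# Assumptions: step 6 is called "Offer Sent", step 7 carries the status "Closed Won".
MODEL = {
    "1 Lead Nurturing": ["2 Qualifying"],
    "2 Qualifying": ["3 Follow-up Call"],
    "3 Follow-up Call": ["4 Needs Analysis", "8 Closed Lost"],  # first gateway
    "4 Needs Analysis": ["5 Follow-up Call"],
    "5 Follow-up Call": ["6 Offer Sent", "8 Closed Lost"],      # second gateway
    "6 Offer Sent": ["7 Closed Won", "8 Closed Lost"],
    "7 Closed Won": [],
    "8 Closed Lost": [],
}

# Timer between "Needs Analysis" and the second follow-up call (t = 14 days).
TIMERS = {("4 Needs Analysis", "5 Follow-up Call"): timedelta(days=14)}


def end_states(model):
    """Steps without successors are the defined process ends."""
    return sorted(step for step, successors in model.items() if not successors)


def all_paths(model, start):
    """Enumerate every path from `start` to an end state (the model has no loops)."""
    successors = model[start]
    if not successors:
        return [[start]]
    return [[start] + rest for nxt in successors for rest in all_paths(model, nxt)]


for path in all_paths(MODEL, "1 Lead Nurturing"):
    print(" -> ".join(path))
```

Because feedback loops were left out, the model is acyclic and the recursion terminates. A model with loops would need a different traversal.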
Data mining for processes
Process mining can be used to validate processes. This means that reality can be compared with its model and discrepancies can be identified. Do processes really run as they are modelled? In our case, for example, it might turn out that many instances (sales opportunities) remain at the “Qualifying” or “Needs Analysis” status without ever reaching the end of the process. This is called a dead end. Alternatively, sales opportunities might be entered directly as “Closed Lost” or “Closed Won”, skipping the intermediate steps. The validation would show whether the salespeople in our case actually use the system as the process expects. After validation with the help of process mining, either the process could be adapted accordingly or the behaviour of the process participants could be changed. In summary, process mining for process validation compares reality with its modelled version.
- What does the process actually look like?
- Where are the so-called dead ends?
- Which paths are not used?
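The three questions above can be answered mechanically once an event log is available. The sketch below is illustrative only: the log entries, the opportunity ids and the status-level model (including the status "Offer Sent") are invented for this example. It also simplifies the notion of a dead end to "the last recorded status is not a defined end state". A real analysis would additionally need a time threshold to tell stuck opportunities from ones that are still in progress.

```python
from collections import defaultdict

# Status-level view of the modelled process (assumed for this sketch).
MODELLED = {
    ("Qualifying", "Needs Analysis"),
    ("Qualifying", "Closed Lost"),
    ("Needs Analysis", "Offer Sent"),
    ("Needs Analysis", "Closed Lost"),
    ("Offer Sent", "Closed Won"),
    ("Offer Sent", "Closed Lost"),
}
END_STATES = {"Closed Won", "Closed Lost"}

# Hypothetical event log: (opportunity id, new status), in chronological order.
EVENT_LOG = [
    ("opp-1", "Qualifying"), ("opp-1", "Needs Analysis"),
    ("opp-1", "Offer Sent"), ("opp-1", "Closed Won"),
    ("opp-2", "Qualifying"), ("opp-2", "Closed Lost"),
    ("opp-3", "Qualifying"), ("opp-3", "Needs Analysis"),  # never closed
    ("opp-4", "Closed Won"),                               # entered directly as won
    ("opp-5", "Qualifying"), ("opp-5", "Closed Won"),      # skipped the offer
]


def traces(log):
    """Group the log into one ordered list of statuses per opportunity."""
    result = defaultdict(list)
    for case_id, status in log:
        result[case_id].append(status)
    return dict(result)


def validate(log, modelled, end_states, start_state="Qualifying"):
    observed = set()
    dead_ends, wrong_start = [], []
    for case_id, trace in traces(log).items():
        observed.update(zip(trace, trace[1:]))  # consecutive status pairs
        if trace[-1] not in end_states:
            dead_ends.append(case_id)
        if trace[0] != start_state:
            wrong_start.append(case_id)
    return {
        "dead_ends": dead_ends,                        # stuck before a process end
        "wrong_start": wrong_start,                    # did not begin at the modelled start
        "unmodelled": sorted(observed - modelled),     # happened, but not in the model
        "unused": sorted(modelled - observed),         # modelled, but never happened
    }


report = validate(EVENT_LOG, MODELLED, END_STATES)
for key, value in report.items():
    print(key, value)
```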
The graphic below shows two simulated example processes A and B. Process A is the modelled process, whereas process B represents reality. Process A does not allow the steps (A, B, C, D) to be traversed backwards. Process B, on the other hand, shows that in reality there are transitions from C to A and from B to A. Such visualisations therefore show discrepancies very clearly. Non-modelled processes can also be visualised.
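The comparison between a modelled and an observed process can be sketched by counting which step directly follows which. The traces below are made up for illustration and are not the data behind the graphic. The step names A to D follow the text.

```python
from collections import Counter

# Modelled process: A -> B -> C -> D, no backward steps.
MODELLED_ORDER = ["A", "B", "C", "D"]
MODELLED_EDGES = set(zip(MODELLED_ORDER, MODELLED_ORDER[1:]))

# Hypothetical observed runs, including steps back from B to A and from C to A.
OBSERVED_TRACES = [
    ["A", "B", "C", "D"],
    ["A", "B", "A", "B", "C", "D"],
    ["A", "B", "C", "A", "B", "C", "D"],
]


def directly_follows(traces):
    """Count how often each step is directly followed by another step."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


def backward_steps(counts, order):
    """Keep only transitions that jump to an earlier step of the modelled order."""
    position = {step: index for index, step in enumerate(order)}
    return {
        edge: n for edge, n in counts.items()
        if position[edge[1]] < position[edge[0]]
    }


observed = directly_follows(OBSERVED_TRACES)
deviations = {edge: n for edge, n in observed.items() if edge not in MODELLED_EDGES}
print(deviations)
```

The counts in `observed` are the kind of data from which a process graph with weighted edges can be drawn.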
Processes generate data, and lots of it. Often this data is captured only in a very reduced form, i.e. just the most important states are extracted and stored. In our process, probably only the current or the final state would be saved. Often, interactions between salespeople and customers are recorded as well, such as the follow-up calls in this case. In general, states are often stored, but changes are not. However, the answers to questions such as “How many sales opportunities end up in Closed Lost, and from which sales stages do they come?” are highly relevant, and process adjustments could be derived from them. An analysis of processing times is often just as interesting: how long do the individual process stages take? All this information can be used to gain insights into process flows, make processes more efficient and adapt them accordingly.
- Which process paths are used most often? Which ones are not used at all?
- What are the throughput times, overall and in individual process sections?
- Are there bottlenecks?
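As a rough illustration of such an analysis, the following sketch derives path frequencies, the time spent in each stage and the origin of lost opportunities from a timestamped event log. All entries are invented, and "bottleneck" is simplified here to "the stage with the longest mean duration".

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical event log: (opportunity id, new status, date of the change).
LOG = [
    ("opp-1", "Qualifying", "2024-01-01"),
    ("opp-1", "Needs Analysis", "2024-01-05"),
    ("opp-1", "Closed Won", "2024-01-25"),
    ("opp-2", "Qualifying", "2024-01-02"),
    ("opp-2", "Closed Lost", "2024-01-04"),
    ("opp-3", "Qualifying", "2024-01-03"),
    ("opp-3", "Needs Analysis", "2024-01-09"),
    ("opp-3", "Closed Lost", "2024-01-19"),
]


def group_by_case(log):
    """Return {opportunity id: [(timestamp, status), ...]} sorted by time."""
    cases = defaultdict(list)
    for case_id, status, day in log:
        cases[case_id].append((datetime.fromisoformat(day), status))
    for events in cases.values():
        events.sort()
    return dict(cases)


def path_frequencies(cases):
    """How often is each complete path (sequence of statuses) taken?"""
    return Counter(tuple(status for _, status in events) for events in cases.values())


def mean_days_per_stage(cases):
    """Mean number of days between entering a stage and leaving it."""
    durations = defaultdict(list)
    for events in cases.values():
        for (entered, status), (left, _next_status) in zip(events, events[1:]):
            durations[status].append((left - entered).days)
    return {status: mean(days) for status, days in durations.items()}


def lost_from(cases):
    """From which stage did opportunities move to 'Closed Lost'?"""
    return Counter(
        events[-2][1] for events in cases.values()
        if len(events) > 1 and events[-1][1] == "Closed Lost"
    )


cases = group_by_case(LOG)
stage_means = mean_days_per_stage(cases)
bottleneck = max(stage_means, key=stage_means.get)
print(path_frequencies(cases), stage_means, lost_from(cases), bottleneck)
```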
The dynamic visualisation of processes (as opposed to the static view in BPMN) complements conventional dashboards and graphics such as bar charts or line graphs. Individual process instances (= runs, in our case sales opportunities) can be visualised, showing which stage of the process they are currently in (graphic XY). Alternatively, on an aggregated level, the sales funnel could be visualised in our case, making it directly visible how many instances (sales opportunities) are in which process step. Visualising processes makes it possible to represent complex relationships that conventional diagrams and graphs cannot show. In addition, significantly more information is visible at a glance. A good visualisation of the discrepancy can be found on TowardsDataScience.
- Where in the process is a given process instance currently located?
- Where in the process are all process instances currently located (aggregated)?
- Which process paths are frequently used?
- What is the usual process flow?
- What are average throughput times for different process stages?
- How many process instance runs were there in total in the different paths (in time period XY)?
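The first two questions, the position of a single instance and the aggregated funnel, can be sketched from an event log in a few lines. The log entries and the plain-text rendering are assumptions for illustration. A real dashboard would use a charting or process-mining tool instead.

```python
from collections import Counter

# Hypothetical event log: (opportunity id, new status), in chronological order.
LOG = [
    ("opp-1", "Qualifying"),
    ("opp-2", "Qualifying"),
    ("opp-1", "Needs Analysis"),
    ("opp-3", "Qualifying"),
    ("opp-2", "Closed Lost"),
    ("opp-4", "Qualifying"),
]
STAGE_ORDER = ["Qualifying", "Needs Analysis", "Closed Won", "Closed Lost"]


def current_stage(log):
    """Latest recorded status per opportunity (later events overwrite earlier ones)."""
    stage = {}
    for case_id, status in log:
        stage[case_id] = status
    return stage


def funnel(log):
    """Aggregated view: how many opportunities are currently in each stage?"""
    return Counter(current_stage(log).values())


def render(counts, order):
    """Very small text 'chart': one line per stage, one '#' per opportunity."""
    return "\n".join(f"{stage:<15}{'#' * counts[stage]} {counts[stage]}" for stage in order)


print(current_stage(LOG)["opp-1"])
print(render(funnel(LOG), STAGE_ORDER))
```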
The prerequisite: event logs
Event sourcing is the essential prerequisite for process mining. Event sourcing is the recording of changes in states. In terms of data, this means that changes to the data are recorded, or logged, in so-called event logs. As a result, the history of changes remains fully visible. For our process, this would mean that every change in the sales stage is recorded and stored in an event log.
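What recording every change in the sales stage could look like is sketched below. The class and field names (`Opportunity`, `StageChange`, `changed_at`) are hypothetical and do not refer to any particular CRM system. The in-memory list stands in for a persistent event log.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class StageChange:
    """One event log entry: a single change of the sales stage."""
    opportunity_id: str
    old_stage: Optional[str]
    new_stage: str
    changed_at: datetime


class Opportunity:
    """Stores the current stage and logs every change instead of overwriting silently."""

    def __init__(self, opportunity_id: str, event_log: List[StageChange]):
        self.opportunity_id = opportunity_id
        self._stage: Optional[str] = None
        self._event_log = event_log

    @property
    def stage(self) -> Optional[str]:
        return self._stage

    @stage.setter
    def stage(self, new_stage: str) -> None:
        if new_stage == self._stage:
            return  # no change of state, so nothing to log
        self._event_log.append(
            StageChange(self.opportunity_id, self._stage, new_stage,
                        datetime.now(timezone.utc))
        )
        self._stage = new_stage


event_log: List[StageChange] = []
opp = Opportunity("opp-1", event_log)
opp.stage = "Qualifying"
opp.stage = "Qualifying"       # repeated value: not logged
opp.stage = "Needs Analysis"

for entry in event_log:
    print(entry.opportunity_id, entry.old_stage, "->", entry.new_stage)
```

Storing only `opp.stage` would keep the current state. The list of `StageChange` entries is what preserves the history that process mining needs.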