Linear Regression

Linear regression is one of the most widely used methods in statistics and machine learning. It models the linear relationship between a dependent target variable (Y) and one or more independent variables (X), also called regressors. In other words, it estimates the influence of X on Y. For each regressor, a coefficient βi is estimated, usually by ordinary least squares (OLS). The magnitude of a coefficient describes the strength of the influence (comparable across regressors only if they are measured on similar scales), and its sign describes the direction.
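To make the idea of OLS coefficients more concrete, here is a minimal sketch using NumPy. The data are synthetic and the "true" coefficients (1.0, 3.0 and -0.5) are assumptions chosen only for illustration; the point is that the estimated coefficients recover both the magnitude and the sign of each influence.

```python
import numpy as np

# Synthetic data (assumption for illustration): y depends positively on x1
# and negatively on x2, plus a little random noise.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 3.0 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=n)

# Design matrix with a leading column of ones for the intercept β0.
X = np.column_stack([np.ones(n), x1, x2])

# OLS chooses the coefficients that minimize the sum of squared residuals.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print("β0, β1, β2 =", beta)
```

In this sketch, β1 comes out large and positive (strong positive influence) and β2 small and negative (weak negative influence), mirroring how the data were generated.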

Applied to sales, linear regression can be used, for example, to objectively estimate the potential value of leads (measured as expected sales in EUR). Sales employees can then use this value to prioritize leads, which systematically supports lead processing that maximizes sales. The model is based on sales volumes (in EUR) from previous periods and the corresponding metrics (X), such as the size of the customer company, address of the company, number of employees, and sales revenue.

Code Snippet

from sklearn.linear_model import LinearRegression

# X_train: feature matrix (n_samples x n_features), y_train: target values (e.g. sales in EUR)
reg = LinearRegression().fit(X_train, y_train)


The code snippet is written in Python and uses the scikit-learn library.
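The snippet above assumes that X_train and y_train already exist. A self-contained version might look like the following sketch. The data are randomly generated, and the split ratio and coefficient values are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): two features and a linear target with noise.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(300, 2))
y = 5.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=300)

# Hold back part of the data to check the model on unseen observations.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

reg = LinearRegression().fit(X_train, y_train)

print("intercept:", reg.intercept_)
print("coefficients:", reg.coef_)
print("R^2 on test data:", reg.score(X_test, y_test))
```

The fitted intercept is available as reg.intercept_ and the regression coefficients as reg.coef_; reg.score returns the coefficient of determination (R²).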

Example Use Case


Potential value of the lead (in EUR) = β0 + β1*Company size + β2*Company sales – β3*Distance to the company

The larger the company and the higher its sales, the higher the potential value of the lead, since larger companies usually have higher demand than smaller ones. In the manufacturing industry, the distance to the company can have a negative influence on the potential value of the lead, since transport costs rise with distance, which can in turn reduce the expected demand.

Linear regression quantifies the strength of influence of each regressor (X) on the potential value of the lead (Y) in the form of regression coefficients. Once the coefficients have been estimated from historical data, they can be used for forecasting. For example, if a potential customer fills out a form on your website and enters the relevant information, the expected value of the lead can be calculated in the background and made available directly to the sales employee. In many cases, the company name is sufficient and the remaining information can be supplied from external sources.
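The workflow described here (estimate coefficients from historical data, then score a new lead) can be sketched as follows. All data are synthetic, and the variable names, units (employees, million EUR, kilometres) and the generating coefficients are hypothetical choices for illustration only, not values from a real sales dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical historical leads (synthetic, for illustration only).
rng = np.random.default_rng(1)
n = 500
employees = rng.uniform(10, 1000, size=n)    # company size
sales_meur = rng.uniform(1, 200, size=n)     # company sales in million EUR
distance_km = rng.uniform(5, 800, size=n)    # distance to the company

# Assumed "true" relationship, used only to generate example targets.
lead_value = (2000 + 15 * employees + 120 * sales_meur - 8 * distance_km
              + rng.normal(scale=500, size=n))

# Step 1: estimate the regression coefficients from historical data.
X_hist = np.column_stack([employees, sales_meur, distance_km])
model = LinearRegression().fit(X_hist, lead_value)
b1, b2, b3 = model.coef_
# Note: the formula above writes "- β3*Distance"; here the minus sign is
# part of the estimated coefficient, so b3 itself is negative.
print(f"size: {b1:.2f}, sales: {b2:.2f}, distance: {b3:.2f}")

# Step 2: score a new lead, e.g. from a web form
# (250 employees, 40 million EUR sales, 120 km away).
new_lead = np.array([[250, 40, 120]])
expected_value = model.predict(new_lead)[0]
print(f"expected lead value: {expected_value:.0f} EUR")
```

In practice, the fitted model would be stored and called whenever a new lead arrives, so that the estimate is available to the sales employee without manual calculation.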

Note that linear regression is suitable for estimating metrically scaled target variables. For binary target variables such as purchase vs. non-purchase or click vs. non-click, logistic regression or a decision tree is used instead.
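For comparison, a binary target such as purchase vs. non-purchase could be modelled with scikit-learn's LogisticRegression, as in the sketch below. The data are synthetic, and the feature (a standardized company size) and the generating parameters are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data (assumption): purchase probability rises with company size.
rng = np.random.default_rng(7)
n = 400
company_size = rng.normal(size=n)  # standardized company size
p = 1 / (1 + np.exp(-(0.5 + 2.0 * company_size)))
purchased = rng.binomial(1, p)     # 1 = purchase, 0 = non-purchase

clf = LogisticRegression().fit(company_size.reshape(-1, 1), purchased)

# Predicted purchase probabilities for a small and a large company.
proba = clf.predict_proba([[-1.0], [1.0]])[:, 1]
print("P(purchase) small vs. large company:", proba)
```

Instead of a value in EUR, the model returns a probability between 0 and 1, which is the appropriate output for a yes/no target.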

Linear regression grouped by binary variable
This plot shows two linear regression lines, one for each value of an arbitrary binary variable. The shaded areas indicate the confidence intervals.

The regression plot above could depict the linear relationship between two variables, for example the potential value of the lead (y-axis) and company size (x-axis), with the colours distinguishing the groups of a categorical variable such as country.
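The lines in such a grouped plot correspond to one regression fit per group. A minimal sketch of that idea, without the plotting itself, could look like this. The data are synthetic, and the group labels and slopes are assumptions chosen for illustration.

```python
import numpy as np

# Synthetic data (assumption): two groups, e.g. two countries,
# with different slopes between company size and lead value.
rng = np.random.default_rng(3)
n = 200
company_size = rng.uniform(10, 500, size=n)
group = rng.integers(0, 2, size=n)          # binary grouping variable
slope = np.where(group == 1, 30.0, 15.0)
lead_value = 1000 + slope * company_size + rng.normal(scale=300, size=n)

# Fit one least-squares line per group.
lines = {}
for g in (0, 1):
    mask = group == g
    # np.polyfit with deg=1 returns (slope, intercept) of the fitted line.
    fitted_slope, fitted_intercept = np.polyfit(
        company_size[mask], lead_value[mask], deg=1
    )
    lines[g] = (fitted_slope, fitted_intercept)
    print(f"group {g}: slope={fitted_slope:.1f}, intercept={fitted_intercept:.0f}")
```

Plotting libraries such as seaborn can draw this kind of figure directly (for example with lmplot and its hue parameter), including the confidence bands.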

More resources about machine learning

Data integration

How machine learning benefits from data integration
The causal chain “data integration-data quality-model performance” explains why effective data integration is necessary for machine learning that is easier and faster to implement and more successful. In short, good data integration leads to higher data quality, which in turn improves the predictive power of machine learning models.

From a business perspective, there are both cost-reducing and revenue-increasing effects. Costs fall because model development becomes cheaper (less custom code, thus less maintenance, etc.). Revenue rises because the better predictive power of the models leads to more precise targeting, cross- and upselling, and more accurate evaluation of leads and opportunities, in both B2B and B2C.


How to use machine learning with the Integration Platform
You can make the data from your centralized Marini Integration Platform available to external machine learning services and applications. The integration works seamlessly via the HubEngine or direct access to the platform, depending on the requirements of the third-party provider. For example, one vendor for standard machine learning applications in sales is Omikron. But you can also use standard applications on AWS or in the Google Cloud. Connecting to your own servers is just as easy if you want to program your own models there.

If you need support on how to integrate machine learning models into your platform, please contact our sales team. We will be happy to help you!


Frequent applications of machine learning in sales
Machine learning can support sales in a variety of ways. For example, it can calculate closing probabilities, estimate cross-selling and up-selling potential, or generate recommendations. The essential point is that salespeople receive additional decision support, which lets them concentrate on their actual activity, i.e., selling. For example, a salesperson can more quickly identify which leads, opportunities or customers are most promising at the moment and contact them. The salesperson still makes the final decision; machine learning only assists. In the end, it is not the model that sells, but the human being.

