Gradient Boosting

Gradient boosting is a machine learning method that belongs to the ensemble methods and is based on decision trees. Many small, weak learners, in this case decision trees, are estimated one after another, and their weighted predictions are added together to yield the final model.

Weak and Strong Learners

A weak learner is only marginally better at classification than random guessing. “The motivation for boosting was a procedure that combines the outputs of many ‘weak’ classifiers to produce a powerful ‘committee.’” (Friedman, Jerome; Hastie, Trevor; Tibshirani, Robert (2008): The Elements of Statistical Learning. Springer Series in Statistics, New York, NY, USA (2), p. 337). Combining many weak learners in this way creates a strong learner. In contrast to bagging, the decision trees in boosting are not independent of each other: they are grown sequentially, and each tree is created using information from the trees before it. The gradient boosting algorithm runs through the following steps.


  1. First, a weak decision tree is fitted to the entire data set using the target variable. A random sample is then drawn and classified using this tree.
  2. Then the residuals of the observations classified in step 1 are calculated.
  3. The residuals from step 2 are used as the target variable, and another decision tree is estimated.
  4. The decision tree from step 3 is combined with the previous tree (in subsequent iterations: with all previous trees).
  5. In this combination, the decision tree from step 3 is weighted with a learning rate.
  6. Steps 2-5 are repeated iteratively k times, each time using the current combined model to classify the sample.
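
The sequential procedure above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the scikit-learn implementation: it uses a regression target with squared error (so the residuals are simply the differences between target and current prediction), starts from a constant mean prediction instead of a first tree, and fits every tree on the full data set rather than on a random sample. The function names fit_boosted_trees and predict_boosted_trees and the toy data are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_boosted_trees(X, y, n_trees=50, learning_rate=0.1, max_depth=2):
    """Fit shallow regression trees sequentially on the current residuals."""
    base_prediction = float(np.mean(y))            # constant starting model
    prediction = np.full(len(y), base_prediction)
    trees = []
    for _ in range(n_trees):
        residuals = y - prediction                 # errors of the current ensemble
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residuals)                     # the next tree explains the residuals
        prediction += learning_rate * tree.predict(X)  # add the tree, weighted by the learning rate
        trees.append(tree)
    return base_prediction, trees


def predict_boosted_trees(X, base_prediction, trees, learning_rate=0.1):
    """Sum the base prediction and the weighted contributions of all trees."""
    prediction = np.full(X.shape[0], base_prediction)
    for tree in trees:
        prediction += learning_rate * tree.predict(X)
    return prediction


# Toy regression data, for illustration only
rng = np.random.default_rng(0)
X_demo = rng.uniform(-3, 3, size=(200, 1))
y_demo = np.sin(X_demo[:, 0]) + rng.normal(scale=0.1, size=200)

base, demo_trees = fit_boosted_trees(X_demo, y_demo)
mse_start = float(np.mean((y_demo - base) ** 2))
mse_boosted = float(np.mean((y_demo - predict_boosted_trees(X_demo, base, demo_trees)) ** 2))
print("training MSE, constant model:", mse_start)
print("training MSE, boosted trees: ", mse_boosted)
```

Each pass through the loop corresponds to one repetition of the residual, fit, weight and combine steps; the training error of the combined model shrinks as trees are added.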

From Weak to Strong Learners

Gradient boosting is often described as a slow learner: the overall model is improved iteratively, in small steps, by reducing the residuals. Each decision tree therefore builds on the previous ones. After a new decision tree is added, classifications are again estimated from a random sample and the residuals are recalculated. These residuals are used to grow the next decision tree, which in turn is incorporated into the overall model, weighted with the learning rate. The lower the learning rate, the larger the number k of decision trees should be. The literature recommends learning rates between 0.001 and 0.1. In summary, gradient boosting uses further decision trees to explain the errors of the previous ones, so that the classifications are corrected and improved with each additional tree.
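
The interplay between learning rate and number of trees can be explored with a small experiment. The sketch below assumes synthetic data from make_classification and two arbitrarily chosen learning rates; it uses scikit-learn's staged_predict to record the test accuracy after each added tree. The variable names are illustrative, and the resulting numbers depend on the data, so no particular outcome is implied.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (an assumption for illustration)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

staged_accuracy = {}
for rate in (0.01, 0.1):
    model = GradientBoostingClassifier(
        n_estimators=200, learning_rate=rate, random_state=0
    )
    model.fit(X_tr, y_tr)
    # staged_predict yields the test predictions after 1, 2, ..., n_estimators trees
    staged_accuracy[rate] = [
        accuracy_score(y_te, y_hat) for y_hat in model.staged_predict(X_te)
    ]

# Compare the accuracy after 10, 50 and 200 trees for each learning rate
for rate, scores in staged_accuracy.items():
    print(rate, scores[9], scores[49], scores[-1])
```

Plotting the recorded scores against the number of trees is a common way to choose the two parameters together.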

Code Snippet

import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score, recall_score
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import GradientBoostingClassifier

# X_train, y_train and X_test are assumed to have been prepared beforehand
gradientboosting = GradientBoostingClassifier(n_estimators=1000, random_state=42)

# Estimate the recall with 3-fold cross-validation on the training data
cv_scores = cross_val_score(gradientboosting, X_train, y_train, cv=3, scoring='recall')
print("Average 3-Fold CV recall score: {}".format(np.mean(cv_scores)))

# cross_val_score only fits clones of the model, so fit it before predicting
gradientboosting.fit(X_train, y_train)
y_pred = gradientboosting.predict(X_test)
y_pred_proba = gradientboosting.predict_proba(X_test)[:, 1]

The code snippet is written in the programming language Python and is based on the module scikit-learn.
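
Predictions from a fitted classifier can then be evaluated with scikit-learn's metric functions. The following self-contained sketch assumes synthetic data and a smaller number of trees than above so that it runs quickly; the data set, the split and the names clf, cm and test_recall are illustrative choices, not part of the original snippet.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report, confusion_matrix, recall_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data stands in for a real data set
X, y = make_classification(n_samples=400, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

clf = GradientBoostingClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows of the confusion matrix are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
test_recall = recall_score(y_test, y_pred)

print(cm)
print(classification_report(y_test, y_pred))
print("Test recall:", test_recall)
```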

For hands-on tutorials on how to build a gradient boosting model, visit the data science blog TowardsDataScience.

More resources about machine learning

Data integration

How machine learning benefits from data integration
The causal chain “data integration, data quality, model performance” describes why effective data integration is necessary for machine learning that is easier and faster to implement and more successful. In short, good data integration leads to higher data quality, which in turn gives machine learning models better predictive power.

From a business perspective, there are both cost-reducing and revenue-increasing effects. Costs fall because model development requires less custom code and therefore less maintenance. Revenue rises because the better predictive power of the models leads to more precise targeting, more cross-selling and upselling, and more accurate evaluation of leads and opportunities, in both B2B and B2C.


How to use machine learning with the Integration Platform
You can make the data from your centralized Marini Integration Platform available to external machine learning services and applications. The integration works seamlessly via the HubEngine or direct access to the platform, depending on the requirements of the third-party provider. For example, one vendor for standard machine learning applications in sales is Omikron. But you can also use standard applications on AWS or in the Google Cloud. Connecting to your own servers is just as easy if you want to program your own models there.

If you need support on how to integrate machine learning models into your platform, please contact our sales team. We will be happy to help you!


Frequent applications of machine learning in sales
Machine learning can support sales in a variety of ways. For example, it can calculate closing probabilities, estimate cross-selling and up-selling potential, or generate recommendations. The essential point is that salespeople receive additional decision support, which lets them concentrate on their actual activity: selling. For example, a salesperson can identify more quickly which leads, opportunities or customers are most promising at the moment and contact them. The salesperson still makes the final decision; machine learning only assists. In the end, it is not the model that sells, but the human being.

Here you will find a short introduction to machine learning and the most common applications in sales.
