Knime: Building a logistic regression model Part 1

by gatekeeper on 03.08.2019

Knime is a powerful platform for data science. It lets you combine analytical steps into a workflow, a pipeline of connected modules (nodes).

The best way to start working with the platform is to work through an example. In business, public services and other fields there are core questions that have a “Yes/No” answer (dichotomous outcomes). Here are a few possible questions:

  • What are the drivers of a product recommendation (recommended/not recommended)?
  • What are the drivers of churn (current customers / customers who left within the last month)?
  • What qualities or characteristics does it take to become a city mayor (mayor / unsuccessful candidate)?

The last question on the list is the one discussed below: “How to become a city mayor?”

This is the dataset that will be used as input for the analysis. The statistical approach is logistic regression, which is well suited to describing and testing hypotheses about the relationship between a categorical outcome variable and one or more categorical or continuous predictor variables.
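To make the idea concrete outside of Knime, here is a minimal sketch of a logistic regression fitted by plain gradient descent in Python. It is not what Knime does internally. The toy data, the two predictors (centered age in decades and a "held prior public office" flag) and all function names are assumptions made up for illustration, not taken from the dataset in this post.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function: maps any real number to (0, 1)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    # Probability of the "yes" outcome for one row of predictors
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    # Batch gradient descent on the average log-loss
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Hypothetical toy data (invented for this sketch):
# [(age - 45) / 10, held prior public office (1 = yes, 0 = no)]
X = [[-1.0, 0], [-0.3, 1], [0.5, 1], [-1.6, 0], [1.6, 1],
     [0.3, 0], [1.0, 1], [-1.4, 0], [0.8, 0], [-0.5, 1]]
# 1 = became mayor, 0 = unsuccessful candidate
y = [0, 1, 1, 0, 1, 0, 0, 0, 1, 1]

w, b = fit_logistic(X, y)

# exp(coefficient) is the odds ratio associated with a one-unit change
odds_ratio_prior_office = math.exp(w[1])
print("coefficients:", w, "intercept:", b)
print("odds ratio, prior office:", odds_ratio_prior_office)
```

Each coefficient describes how the log-odds of the "yes" outcome change with its predictor, which is what makes the method useful for testing hypotheses about drivers rather than only for prediction.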

The output variable, colored in green, answers the question “Did the candidate become city mayor?” based on the historical record of candidates for this position. All other variables (e.g. past occupation, age), colored in blue, are independent variables that could explain the outcome.
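The split into one output variable and several independent variables can be sketched in Python as follows. Categorical predictors are commonly expanded into 0/1 indicator columns before fitting, with one level kept as the reference. The column names (past_occupation, age, became_mayor), the three records and the encode helper are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical records; column names and values are assumptions for illustration
records = [
    {"past_occupation": "lawyer", "age": 52, "became_mayor": "yes"},
    {"past_occupation": "teacher", "age": 44, "became_mayor": "no"},
    {"past_occupation": "entrepreneur", "age": 39, "became_mayor": "no"},
]

def encode(records, target="became_mayor", categorical=("past_occupation",)):
    # Sorted levels per categorical column; the first level is the reference
    levels = {c: sorted({r[c] for r in records}) for c in categorical}
    numeric = [k for k in records[0] if k != target and k not in categorical]

    # Predictor names: numeric columns first, then one indicator per non-reference level
    names = list(numeric)
    for c in categorical:
        names += [f"{c}={lvl}" for lvl in levels[c][1:]]

    X, y = [], []
    for r in records:
        row = [float(r[k]) for k in numeric]
        for c in categorical:
            row += [1.0 if r[c] == lvl else 0.0 for lvl in levels[c][1:]]
        X.append(row)                            # independent variables
        y.append(1 if r[target] == "yes" else 0)  # dichotomous output variable
    return names, X, y

names, X, y = encode(records)
print(names)
print(X)
print(y)
```

Dropping one level per categorical column avoids perfectly redundant columns, and the coefficients of the remaining indicators are then read relative to that reference level.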

This is an overview of the prediction model in Knime. In the next post, we will go through the nodes in depth in order to understand how the model can generate reliable insights about the qualities of a successful city mayor.


2 Responses to Knime: Building a logistic regression model Part 1

    Stanimir says:

    Hello, I also have a similar task to solve. How do you check the accuracy of the model?

    Klaus F says:

    Could you please upload the dataset?
