Introduction

Part 2 is intended to illustrate how binary classification performance metrics make it possible for you to put an exact value, in dollars per event, on new information that relates to a predictive model. Note that new information will be worth far more if it is compared to no forecasting model rather than the state of partial knowledge available from the current model. Sellers of information (and data science consultants!) love to take credit for any information gain they achieve over the base rate. Very often some intermediate state of knowledge is already available for which no additional spending is required. Evaluating the realistic incremental financial gain from new information, whether licensing a third-party commercial database or collecting new data internally, is therefore of great practical value, as this sets an upper bound on what your company should be willing to pay to license or create the new information.

In this case study, your boss has been in discussions with an advanced machine-learning predictive-analytics credit-risk analytics company that claims to score individual probability of default with very high information gain. Let's call the company Eggertopia. Eggertopia sales representatives claim their pre-processed risk-scores can achieve AUC values as high as .+ or even higher. However, Eggertopia scores are sold per-event, and they are expensive! Your boss asks you to determine the incremental financial value to the bank of purchasing Eggertopia risk scores on future credit-card applicants.

Eggertopia agrees to apply its algorithms to generate credit scores for the /00 individuals in the Training and Test Sets. Eggertopia scores do not need to be combined with anything else to make a model. However, since the scores range from approximately -00 (best credit risk) to /300 (most likely to default) they will need to be standardized and adjusted to fit the -6. to 6. range of the AUC Calculator Spreadsheet (below).

AUC_Calculator and Review of AUC Curve.xlsx

You will determine the sustainable AUC of the Eggertopia scores, the sustainable cost-per-event, and the savings per event, when comparing Eggertopia data to the base rate forecast. You will then calculate the incremental savings per event if you compare use of Eggertopia data to use of your current model developed in Part 9.

Question: Give your answer to two digits to the right of the decimal point. .+
r
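The standardization step described above can be sketched in Python instead of the spreadsheet. This is only an illustration under stated assumptions: the raw scores, the default labels, and the clipping bounds `lo` and `hi` are made-up placeholders, and the function names are hypothetical. The AUC here is computed as the share of (defaulter, non-defaulter) pairs that the scores rank correctly, which may differ in detail from how the AUC Calculator Spreadsheet works.

```python
from statistics import mean, pstdev

def standardize(scores, lo, hi):
    """Convert raw scores to z-scores, then clip them to the range [lo, hi]."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        raise ValueError("scores have no variation, so they cannot be standardized")
    return [min(hi, max(lo, (s - mu) / sigma)) for s in scores]

def auc(scores, defaulted):
    """Share of (defaulter, non-defaulter) pairs ranked correctly; ties count half."""
    pos = [s for s, d in zip(scores, defaulted) if d]
    neg = [s for s, d in zip(scores, defaulted) if not d]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: higher raw score means more likely to default.
raw = [120, 300, 450, 800, 950, 1100]
defaulted = [0, 0, 1, 0, 1, 1]

z = standardize(raw, lo=-2.0, hi=2.0)  # placeholder bounds, not the spreadsheet's
area = auc(z, defaulted)
print(round(area, 2))
```

Because standardizing is a monotone transformation, it does not change the ranking of applicants, so the AUC of the standardized scores equals the AUC of the raw scores unless clipping creates ties.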
$/0 $00 $/0
x x x
r
Using the same threshold as used on the training set, what is the cost per event of the Eggertopia scores on the Test Set? Round to the nearest dollar. $+6+
r
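The idea of fixing the threshold on the training set and then reusing it on the test set can be sketched as follows. Everything here is an assumption for illustration: the scores, the default labels, and the two misclassification costs `COST_FP` and `COST_FN` are invented, and the helper names are hypothetical. Use the cost figures from your own Part 9 work in practice.

```python
def cost_per_event(scores, defaulted, threshold, cost_fp, cost_fn):
    """Average misclassification cost when scores >= threshold are flagged as defaults."""
    total = 0.0
    for s, d in zip(scores, defaulted):
        flagged = s >= threshold
        if flagged and not d:
            total += cost_fp   # good applicant wrongly flagged
        elif not flagged and d:
            total += cost_fn   # defaulter wrongly accepted
    return total / len(scores)

def best_threshold(scores, defaulted, cost_fp, cost_fn):
    """Pick the candidate threshold with the lowest cost per event on the given data."""
    candidates = sorted(set(scores)) + [float("inf")]  # inf means "flag nobody"
    return min(candidates,
               key=lambda t: cost_per_event(scores, defaulted, t, cost_fp, cost_fn))

# Hypothetical standardized scores and outcomes (1 = defaulted).
train_scores = [-1.2, -0.5, 0.1, 0.4, 0.9, 1.5]
train_default = [0, 0, 0, 1, 0, 1]
test_scores = [-0.9, -0.2, 0.3, 0.6, 1.1, 1.8]
test_default = [0, 1, 0, 0, 1, 1]
COST_FP, COST_FN = 100.0, 500.0  # assumed dollar costs, not from the case study

# Choose the threshold on training data only, then reuse it unchanged on test data.
t = best_threshold(train_scores, train_default, COST_FP, COST_FN)
train_cost = cost_per_event(train_scores, train_default, t, COST_FP, COST_FN)
test_cost = cost_per_event(test_scores, test_default, t, COST_FP, COST_FN)
print(t, round(train_cost), round(test_cost))
```

The test-set cost is usually higher than the training-set cost because the threshold was tuned to the training data; the test-set figure is the better guide to sustainable performance.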
If the bank did not have your model, or any other way of forecasting default, what is the maximum (break-even) price per event that the bank could theoretically pay for Eggertopia scores? In other words, what are Eggertopia's scores' absolute savings-per-event? Hint: Calculate the difference between the cost-per-event at a 2@ default rate, and the cost-per-event using Eggertopia scores. $/9+ $/26 $/92
x x r
r
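The hint's subtraction can be written out as a short sketch. It assumes that, with no forecasting model at all, the bank treats every applicant the same way and picks the cheaper of "reject everyone" and "accept everyone"; that assumption, the default rate, the costs, and the score-based cost per event below are all hypothetical placeholders rather than figures from the case study.

```python
def base_rate_cost_per_event(default_rate, cost_fp, cost_fn):
    """Cost per event with no model: the cheaper of two blanket policies."""
    reject_all = (1.0 - default_rate) * cost_fp  # every good applicant is a false positive
    accept_all = default_rate * cost_fn          # every defaulter is a false negative
    return min(reject_all, accept_all)

def savings_per_event(baseline_cost, model_cost):
    """Absolute savings per event from using the scores instead of the baseline."""
    return baseline_cost - model_cost

# Hypothetical inputs.
baseline = base_rate_cost_per_event(default_rate=0.3, cost_fp=100.0, cost_fn=500.0)
cost_with_scores = 45.0  # assumed test-set cost per event using the purchased scores
absolute_savings = savings_per_event(baseline, cost_with_scores)
print(round(baseline), round(absolute_savings))
```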
Gain Spreadsheet, [PPV is defined at Cell G/9], obtain True Positive and False Positive Rates from the AUC Calculator Spreadsheet, and use algebra to solve.
Information Gain Calculator.xlsx
.0 ./ .2
x x x
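One standard piece of algebra that connects these quantities is Bayes' rule, which gives the positive predictive value (PPV) from the True Positive Rate, the False Positive Rate, and the base rate of default. Whether this is exactly the relation the hint has in mind is an assumption; the sketch below uses made-up rates and a hypothetical function name.

```python
def ppv(tpr, fpr, base_rate):
    """Positive predictive value: P(default | flagged), by Bayes' rule."""
    true_pos = tpr * base_rate             # share of all events that are flagged defaulters
    false_pos = fpr * (1.0 - base_rate)    # share of all events that are flagged non-defaulters
    if true_pos + false_pos == 0:
        raise ValueError("no applicants are flagged, so PPV is undefined")
    return true_pos / (true_pos + false_pos)

# Hypothetical rates, not taken from the spreadsheets.
example = ppv(tpr=0.8, fpr=0.2, base_rate=0.25)
uninformative = ppv(tpr=0.5, fpr=0.5, base_rate=0.3)
print(round(example, 3), round(uninformative, 3))
```

When the True Positive Rate equals the False Positive Rate, the test carries no information and the PPV collapses to the base rate, which is a useful sanity check on the algebra.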
Incremental Financial Value of Eggertopia Scores

You calculated a cost per event for your own predictive model on Test Set data to answer Quiz 9 - Part 9, Question .

Question: Assuming that the performance of the Eggertopia model and your model both remain stable on any future data (a big assumption), what is the maximum, or break-even, price that the bank could pay per score for Eggertopia, given that it already has your model and data? 200
r
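The contrast between absolute and incremental value comes down to which baseline you subtract from. A minimal sketch, with all three cost-per-event figures invented for illustration and a hypothetical helper name:

```python
def break_even_price(current_cost_per_event, new_cost_per_event):
    """Most the bank could pay per score before the purchase stops paying for itself."""
    return max(0.0, current_cost_per_event - new_cost_per_event)

# Hypothetical test-set costs per event, in dollars.
base_rate_cost = 70.0   # no forecasting model at all
own_model_cost = 60.0   # the model the bank already has
vendor_cost = 45.0      # using the purchased scores

absolute_value = break_even_price(base_rate_cost, vendor_cost)
incremental_value = break_even_price(own_model_cost, vendor_cost)
print(absolute_value, incremental_value)
```

The incremental figure can never exceed the absolute figure as long as the existing model is at least as good as the base rate forecast, which is why comparing against the base rate flatters the seller. The floor at zero reflects that scores performing worse than the existing model are worth nothing to the bank.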