Comparing the Information Gain of Eggertopia Scores and Your Model

Both the Eggertopia Scores and your binary classification model can be thought of as tools to reduce uncertainty about future default outcomes of credit card applicants. Your own model, developed in Part 1, identifies dependencies between, on the one hand, the six types of input data collected by the bank, and on the other hand, the binary outcome default/no default. If we assume that the dependencies identified by Eggertopia Scores and by your model on the Test Set are stable and representative of all future data (a big assumption), we can draw some further conclusions about how much information gain, or reduction in uncertainty, is provided by each. Definitions are given in the Information Gain Calculator Spreadsheet, provided below.

Information Gain Calculator.xlsx

Question: On your model's Test Set results, what is the conditional entropy of default, given your test classifications? Hint: you need your model's true positive rate from Part 1, Question 12, and "test incidence" [proportion of events your model classifies as default] from Part 1, Question 13. Use the condition incidence of 25% and your model's True Positive rate to calculate the portion of TPs. Then you have the inputs needed to use the Information Gain Calculator Spreadsheet.
Recall that the entropy of the original base rate, minus the conditional entropy of default given your test classification, equals the Mutual Information between default and the test.

I(X;Y) = H(X) - H(X|Y)
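The identity above can be sketched in Python. This is a minimal illustration, not a replacement for the Information Gain Calculator Spreadsheet: the function names are our own, and the true positive rate (0.60) and test incidence (0.30) in the example call are hypothetical placeholders for your own model's values. Only the 25% condition incidence comes from the assignment.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability cells contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

def information_gain(condition_incidence, true_positive_rate, test_incidence):
    """Return (H(X), H(X|Y), I(X;Y)) in bits for a binary test.

    X is the true outcome (default / no default) and Y is the test
    classification. Inputs are proportions between 0 and 1.
    """
    # Joint probabilities of the 2x2 confusion table.
    tp = condition_incidence * true_positive_rate  # default, classified default
    fn = condition_incidence - tp                  # default, classified no default
    fp = test_incidence - tp                       # no default, classified default
    tn = 1.0 - tp - fn - fp                        # no default, classified no default
    joint = [tp, fn, fp, tn]
    if min(joint) < -1e-12:
        raise ValueError("inputs imply a negative cell probability")

    h_x = entropy([condition_incidence, 1.0 - condition_incidence])
    h_y = entropy([test_incidence, 1.0 - test_incidence])
    h_x_given_y = entropy(joint) - h_y             # chain rule: H(X|Y) = H(X,Y) - H(Y)
    return h_x, h_x_given_y, h_x - h_x_given_y

# Hypothetical inputs: replace 0.60 and 0.30 with your own model's results.
h_x, h_x_given_y, mi = information_gain(0.25, 0.60, 0.30)
print(f"H(X) = {h_x:.4f}, H(X|Y) = {h_x_given_y:.4f}, I(X;Y) = {mi:.4f}")
```

Two sanity checks are useful: a perfect test (true positive rate 1, test incidence equal to the condition incidence) should give H(X|Y) = 0, and a test whose true positive rate equals its test incidence carries no information, so I(X;Y) should be 0.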
The population of potential credit card customers consists of 25% future defaulters. The base rate incidence of default (.25, .75) has an uncertainty, or entropy, of H(.25, .75) = .25*log2(4) + .75*log2(1.333) = .8113 bits.

Question: On your test set results, what is the Mutual Information, or Information Gain, in average bits per event?
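The base-rate entropy quoted above can be checked numerically. A minimal sketch in Python follows; the helper name `entropy` is our own choice and is not part of the spreadsheet.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Base rate from the assignment: 25% defaulters, 75% non-defaulters.
base_rate_entropy = entropy([0.25, 0.75])
print(round(base_rate_entropy, 4))  # 0.8113
```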
Recall that Percentage Information Gain (P.I.G.) is the ratio I(X;Y)/H(X).

Question: On your Test Set results, what is the Percentage Information Gain (P.I.G.) of your model?
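As a sketch of the ratio just defined, the P.I.G. can be computed as below. The function name is our own, and the mutual information of 0.20 bits is a hypothetical stand-in for your model's result; only the base entropy of .8113 bits comes from the assignment.

```python
def percentage_information_gain(mutual_information_bits, base_entropy_bits):
    """P.I.G. = I(X;Y) / H(X), returned as a fraction between 0 and 1."""
    if base_entropy_bits <= 0:
        raise ValueError("base entropy must be positive")
    return mutual_information_bits / base_entropy_bits

hypothetical_mi = 0.20   # assumed value, replace with your model's I(X;Y)
base_entropy = 0.8113    # H(.25, .75) from the assignment
pig = percentage_information_gain(hypothetical_mi, base_entropy)
print(f"P.I.G. = {pig:.1%}")
```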
Since you have, for your model on the Test Set, a savings-per-event and a bits-per-event (Mutual Information), you can calculate a savings-per-bit. This is a powerful concept, because it places a financial value directly on the information content of a model (or additional data source, like the Eggertopia scores).

Question: How many dollars does the bank save, for every bit of information gain achieved by your model?
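The savings-per-bit calculation is a single division, sketched here in Python. The function name and both input values (100 dollars per event, 0.20 bits per event) are hypothetical; substitute your own savings-per-event and Mutual Information.

```python
def savings_per_bit(savings_per_event, bits_per_event):
    """Dollars saved per bit of information gain."""
    if bits_per_event <= 0:
        raise ValueError("savings per bit is undefined without information gain")
    return savings_per_event / bits_per_event

# Hypothetical inputs, not course answers.
example_savings_per_bit = savings_per_bit(100.0, 0.20)
print(f"{example_savings_per_bit:.2f} dollars per bit")
```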
Information Gain of Eggertopia Scores over the Base Rate

For questions in this section, assume your model and the data it uses are not available; the bank's choice is between Eggertopia scores and the base rate.

Question: What is the Mutual Information of the Eggertopia Scores? In other words, on the Test Set, what is the information gain, in average bits per event, over the base rate of (.25, .75) offered by the Eggertopia Scores?
."3D5 bits per event
On the test set, what is the Eggertopia scores' Percentage Information Gain (PIG)?
15.25%
If Eggertopia data were free, and your model was unavailable, what would the dollar savings per bit of information extracted be?
Dollar savings are 412 rounded to the nearest dollar, from Quiz 2, question H. Value would be 427 per bit.
Incremental Information Gain of Eggertopia Scores Compared to Your Model and Available Data (any answer scores)

(For this section, assume your Model and the Data it uses are available.)

Question: What is the incremental information gain of the Eggertopia scores, over your model from Part 1, in average bits per event, if any?
What is the maximum (break-even) price the bank should pay for Eggertopia scores, per score, if your model from Part 1 and data are already available?
At the above maximum (break-even) price per score, what would be the value per bit of incremental information gained from the Eggertopia scores? Give your answer in $/bit.
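One way to organize the three questions in this section is sketched below in Python. It rests on assumptions that are ours, not the assignment's: that the incremental information gain is the Mutual Information with the scores added minus the Mutual Information of your model alone, that the break-even price per score equals the extra savings per event the scores make possible, and that value per bit is that price divided by the incremental bits. The function name and all numbers are hypothetical.

```python
def incremental_value(savings_with_scores, savings_model_only,
                      mi_with_scores, mi_model_only):
    """Return (incremental bits, break-even price, value per bit).

    Savings are in dollars per event and mutual information in bits per
    event. Value per bit is None when the scores add no information.
    """
    # Assumption: the scores cannot be worth less than nothing to the bank.
    extra_savings = max(savings_with_scores - savings_model_only, 0.0)
    extra_bits = max(mi_with_scores - mi_model_only, 0.0)
    value_per_bit = extra_savings / extra_bits if extra_bits > 0 else None
    return extra_bits, extra_savings, value_per_bit

# Hypothetical inputs, not course answers.
bits, price, per_bit = incremental_value(120.0, 100.0, 0.25, 0.20)
print(f"incremental bits = {bits:.2f}, break-even price = {price:.2f}, "
      f"value per bit = {per_bit:.2f}")
```

If the scores add no information beyond what your model already extracts, the incremental bits and the break-even price are both zero under these assumptions, and a value per bit is not defined.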