Wara Alfa Syukrilla

Djarum Beswan Blogs site


LOGISTIC REGRESSION ON CREDIT STATUS DATA

“Bind knowledge by writing it down”

To bind that knowledge, this post presents an exploratory data analysis and a logistic regression model of credit status based on age, gender, number of dependent children, and home ownership status.

The software used is R.

The code I used is:

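# Load the data; stringsAsFactors=TRUE keeps categorical columns as factors (needed in R >= 4.0)
data<-read.csv("datascoring.csv", stringsAsFactors=TRUE)
summary(data)
# Drop the first column, which is not used in the analysis
data<-data[,-1]
head(data)
# Univariate plots: age histogram and a bar chart for each categorical variable
hist(data$Age)
tabGen<-table(data$Gender)
barplot(tabGen)
tabStat<-table(data$status)
barplot(tabStat)
tabRes<-table(data$Residence.Ownership)
barplot(tabRes)
tabDep<-table(data$number.of.dependants)
barplot(tabDep)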

# Proportions of status by home ownership: overall, by row (margin=1), and by column (margin=2)
prop.table(table(data$status,data$Residence.Ownership))
prop.table(table(data$status,data$Residence.Ownership),margin=1)
prop.table(table(data$status,data$Residence.Ownership),margin=2)
tab.prop.stat.res<-prop.table(table(data$status,data$Residence.Ownership),margin=2)
barplot((tab.prop.stat.res),col=c(6,5),main="Proportion of Credit Status by Home Ownership")

tab.prop.stat.dep<-prop.table(table(data$status,data$number.of.dependants),margin=2)
barplot((tab.prop.stat.dep),col=c(6,5),beside=T,main="Proportion of Credit Status by Number of Dependants")

tab.prop.stat.gen<-prop.table(table(data$status,data$Gender),margin=2)
barplot((tab.prop.stat.gen),col=c(6,5),main="Proportion of Credit Status by Gender")

# Age against status: boxplot, then overlaid density curves (grey = GOOD, red = BAD)
boxplot((data$Age~data$status),main="Age Distribution by Credit Status")
plot(density(data$Age[data$status=="GOOD"]),main="Age Distribution by Credit Status",col=8)
lines(density(data$Age[data$status=="BAD"]),col=2)

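# Fit a logistic regression of status on all remaining variables
model.logistik<-glm(status~ . , data=data, family="binomial")
summary(model.logistik)
# Predicted probability of the second factor level of status ("GOOD")
prob.prediksi<-predict(model.logistik, data, type="response")
# Classify with a 0.5 cutoff; reuse the levels of status so caret can compare the two factors
prediksi<-factor(ifelse(prob.prediksi>0.5, "GOOD", "BAD"), levels=levels(data$status))
data$status
library(caret)
confusionMatrix(prediksi, data$status)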
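
The modelling steps above can be tried without access to datascoring.csv. The following is only a minimal sketch on simulated data: the data frame `toy`, its values, the `FEMALE` label, and the assumed age effect are all hypothetical, and base R's `table()` stands in for caret's `confusionMatrix()`.

```r
# Hypothetical toy data standing in for datascoring.csv (all values simulated).
set.seed(1)
n <- 300
toy <- data.frame(
  Age    = round(runif(n, 21, 60)),
  Gender = factor(sample(c("FEMALE", "MALE"), n, replace = TRUE))
)

# Simulate status so that older applicants are more often GOOD (assumed effect).
p.good <- plogis(-3 + 0.08 * toy$Age)
toy$status <- factor(ifelse(runif(n) < p.good, "GOOD", "BAD"),
                     levels = c("BAD", "GOOD"))

# With a two-level factor response, glm() models the probability of the
# second level, here "GOOD".
fit  <- glm(status ~ ., data = toy, family = "binomial")
prob <- predict(fit, toy, type = "response")

# Classify with a 0.5 cutoff and keep the same factor levels as the reference.
pred <- factor(ifelse(prob > 0.5, "GOOD", "BAD"), levels = levels(toy$status))

# Cross-tabulate predictions against the observed status.
cm <- table(Prediction = pred, Reference = toy$status)
print(cm)
```

Keeping `pred` as a factor with the same levels as `status` matters mostly when the result is passed on to `confusionMatrix()`, which compares two factors.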

 

Exploratory data analysis results:

[Figure: Bar charts for each variable]

[Figure: Exploratory data analysis]

Logistic Regression Results

Call:
glm(formula = status ~ ., family = "binomial", data = data)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.3984  -0.5508   0.2635   0.5573   2.2843

Coefficients:
                           Estimate Std. Error z value Pr(>|z|)
(Intercept)                -0.93874    0.43123  -2.177  0.02949 *
Age                         0.10097    0.01147   8.804  < 2e-16 ***
GenderMALE                 -2.05341    0.13877 -14.797  < 2e-16 ***
Residence.OwnershipOWNED    2.54339    0.21773  11.682  < 2e-16 ***
Residence.OwnershipPARENTS  0.61195    0.26308   2.326  0.02001 *
Residence.OwnershipRENT    -0.65307    0.21207  -3.080  0.00207 **
number.of.dependants       -0.63054    0.04020 -15.685  < 2e-16 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 2900.7 on 2289 degrees of freedom
Residual deviance: 1787.9 on 2283 degrees of freedom
AIC: 1801.9
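
The coefficients above are on the log-odds scale, so they are easier to read once exponentiated. The sketch below copies the printed estimates into a vector and assumes, following R's convention for a factor response with levels BAD and GOOD, that the model describes the probability of GOOD. The applicant used at the end is purely hypothetical.

```r
# Estimates copied from the summary() output above.
beta <- c("(Intercept)"              = -0.93874,
          Age                        =  0.10097,
          GenderMALE                 = -2.05341,
          Residence.OwnershipOWNED   =  2.54339,
          Residence.OwnershipPARENTS =  0.61195,
          Residence.OwnershipRENT    = -0.65307,
          number.of.dependants       = -0.63054)

# Odds ratios: multiplicative change in the odds of GOOD for a one-unit
# increase in a predictor (or versus the reference category of a factor).
odds.ratio <- exp(beta[-1])
print(round(odds.ratio, 3))

# Hypothetical applicant: 35-year-old male who owns his home, 2 dependants.
# The order of x matches the order of beta (the leading 1 is the intercept).
x <- c(1, 35, 1, 1, 0, 0, 2)
eta    <- sum(beta * x)   # linear predictor, i.e. log-odds of GOOD
p.good <- plogis(eta)     # inverse logit converts log-odds to a probability
print(p.good)
```

With the fitted object still in memory, `exp(coef(model.logistik))` gives the same odds ratios without retyping the estimates.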

 

Checking Model Accuracy

Confusion Matrix and Statistics

          Reference
Prediction  BAD GOOD
      BAD   498  185
      GOOD  255 1352

Accuracy : 0.8079
95% CI : (0.7911, 0.8238)
No Information Rate : 0.6712
P-Value [Acc > NIR] : < 2.2e-16

Kappa : 0.5541
Mcnemar’s Test P-Value : 0.001004

Sensitivity : 0.6614
Specificity : 0.8796
Pos Pred Value : 0.7291
Neg Pred Value : 0.8413
Prevalence : 0.3288
Detection Rate : 0.2175
Detection Prevalence : 0.2983
Balanced Accuracy : 0.7705

‘Positive’ Class : BAD
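
The main statistics above follow directly from the four cell counts. As a rough check, the sketch below rebuilds the table by hand and recomputes them with BAD as the positive class. The object names are illustrative only.

```r
# Counts copied from the confusion matrix above
# (rows: prediction, columns: reference). matrix() fills column by column.
cm <- matrix(c(498, 255, 185, 1352), nrow = 2,
             dimnames = list(Prediction = c("BAD", "GOOD"),
                             Reference  = c("BAD", "GOOD")))

accuracy    <- sum(diag(cm)) / sum(cm)                  # correct / total
sensitivity <- cm["BAD", "BAD"]   / sum(cm[, "BAD"])    # true BAD found
specificity <- cm["GOOD", "GOOD"] / sum(cm[, "GOOD"])   # true GOOD found
ppv         <- cm["BAD", "BAD"]   / sum(cm["BAD", ])    # predicted BAD that are BAD
npv         <- cm["GOOD", "GOOD"] / sum(cm["GOOD", ])   # predicted GOOD that are GOOD
balanced    <- (sensitivity + specificity) / 2

print(round(c(Accuracy = accuracy, Sensitivity = sensitivity,
              Specificity = specificity, PPV = ppv, NPV = npv,
              Balanced = balanced), 4))
```

Note that these figures are computed on the same data used to fit the model.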

 
