INDEX
Explanations
mentions of marriage and relationships
NEGATIVE LOGITS
kyt          -0.15
bate         -0.15
eken         -0.15
swer         -0.15
dames        -0.14
ISCO         -0.14
Principal    -0.14
gec          -0.14
á»           -0.14
showc        -0.14
POSITIVE LOGITS
estr         0.21
ex           0.20
model        0.19
beau         0.18
-model       0.17
rum          0.17
model        0.17
cheating     0.16
rum          0.16
Estr         0.16
Activations Density 0.054%