INDEX
Explanations
information about people's personal lives, relationships, and experiences
Negative Logits
inel       -0.90
Init       -0.79
Aren       -0.76
lt         -0.74
Rules      -0.72
steps      -0.72
iterator   -0.71
FIG        -0.71
adj        -0.70
Events     -0.70
Positive Logits
knack        1.32
penchant     1.32
girlfriend   1.15
beard        1.09
tendency     1.09
propensity   1.08
lot          1.05
reputation   1.05
mustache     0.98
wife         0.96
Activations Density: 0.180%