Explanations
names of individuals or entities
Negative Logits
Token         Logit
ACTED         -0.87
CHR           -0.78
senal         -0.72
PDATE         -0.69
ãĥ¼ãĥĨ        -0.68
à¨            -0.67
bulls         -0.65
OPLE          -0.63
conditioned   -0.62
expr          -0.62
Positive Logits
Token         Logit
andowski      1.47
insky         1.22
inski         1.00
enstein       0.99
ield          0.88
die           0.86
omon          0.85
itsch         0.84
gren          0.84
iky           0.84
Activations Density: 0.016%
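
To make the density figure concrete, here is a minimal sketch. It assumes that activation density means the fraction of token positions on which the feature fires (activation above zero). The page itself does not define the term, so treat this as an assumption. The function name `activation_density` and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature is active.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 16 of which activate the feature.
sample = [0.0] * 99_984 + [1.3] * 16
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.016%
```

Under that reading, a density of 0.016% corresponds to roughly 16 active positions per 100,000 tokens, which is typical of a sparse feature.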