INDEX
Explanations
names of specific individuals or entities
names of key individuals associated with specific events or narratives
Negative Logits
Token        Logit
ordinary     -0.77
             -0.77
rowth        -0.65
ambers       -0.65
Band         -0.65
NX           -0.65
Quote        -0.63
intendent    -0.62
hens         -0.62
incre        -0.61
Positive Logits
Token        Logit
Saul         1.05
Berman       0.92
Nab          0.76
Goodman      0.76
Faul         0.72
scripts      0.71
Judah        0.71
tering       0.71
kj           0.70
Boone        0.70
Activations Density 0.028%