INDEX
Explanations
mentions of large entities or institutions
Negative Logits
manship    -0.79
Dialogue   -0.69
chron      -0.68
cia        -0.68
yrinth     -0.68
Emin       -0.67
qi         -0.67
istry      -0.65
CF         -0.65
gemony     -0.65
Positive Logits
oted        1.15
chunk       1.05
gest        1.03
scale       1.01
chunks      0.98
swath       0.92
intestine   0.92
swat        0.90
quantities  0.87
sized       0.85
Activations Density: 1.377%