Explanations
parenthetical references and annotations related to individuals
Negative Logits
å·          -0.15
ège         -0.15
Crisis      -0.14
ÑĢаз        -0.14
cuer        -0.14
efeller     -0.14
xcd         -0.14
nc          -0.13
iquer       -0.13
/spec       -0.13
Positive Logits
born        0.30
191         0.30
190         0.29
188         0.27
186         0.27
192         0.26
185         0.25
189         0.24
183         0.24
184         0.23
Activations Density: 0.055%