INDEX
Explanations
mentions of specific names or entities
instances of indirect references or symbols related to specific entities or names
Negative Logits
lda          -0.74
parity       -0.67
Kenyan       -0.66
iliar        -0.64
scatter      -0.64
condol       -0.64
tremend      -0.63
dust         -0.63
ioned        -0.62
scattering   -0.61

Positive Logits
£            0.99
¹            0.98
âĹ¼          0.88
ħĭ           0.84
ı            0.83
İ            0.83
į            0.83
Į            0.82
¬            0.81
Phill        0.80
Activations Density 0.349%