INDEX
Explanations
references to significant public figures and their actions or statements
Negative Logits
Token      Logit
shack      -0.70
Belg       -0.68
Ambro      -0.67
Guinness   -0.67
polyg      -0.66
tremend    -0.65
sew        -0.65
Unic       -0.64
postal     -0.64
Shap       -0.63
Positive Logits
Token      Logit
¬          1.03
ħ          1.01
ĺ          0.97
¡          0.96
ı          0.96
¹          0.95
Ń          0.95
Ļ          0.94
¤          0.94
Ķ          0.93
Activations Density: 0.124%