INDEX
Explanations
phrases related to international politics and news
instances of non-standard characters or symbols representative of foreign languages or scripts
Negative Logits
isode        -0.71
implant      -0.70
ronic        -0.70
accessory    -0.68
cane         -0.67
ellipt       -0.66
microphone   -0.66
yarn         -0.64
typew        -0.63
cand         -0.63
Positive Logits
¬    1.22
ħ    1.18
Ĵ    1.12
²    1.04
£    1.04
ı    1.03
¹    1.02
ĭ    1.01
º    1.01
®    1.00
Activations Density 0.169%