INDEX
Explanations
country names followed by a symbol related to news or politics
certain symbols or characters that indicate specific categories of reports or statements regarding nations
Negative Logits
blond        -0.67
isode        -0.66
giveaways    -0.66
microphone   -0.66
yarn         -0.66
giveaway     -0.65
ronic        -0.65
treadmill    -0.64
litter       -0.64
discs        -0.63
Positive Logits
¬            1.15
£            1.07
¹            1.01
º            0.98
ħ            0.98
¸            0.96
²            0.96
AFP          0.95
®            0.95
ı            0.95
Activations Density 0.246%
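
For readers unfamiliar with the density figure, here is a minimal sketch of one common reading of it: the fraction of token positions on which the feature fires at all. This page does not state how its density value was computed, so the function name `activation_density`, the `threshold` parameter, and the sample data are all illustrative assumptions, not the dashboard's actual method.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens with a nonzero activation.
    The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical activations over 2000 token positions; the feature fires on 5.
sample = [0.0] * 2000
for position in (3, 250, 777, 1200, 1999):
    sample[position] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.246% would mean the feature is active on roughly one token in every four hundred.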