Explanations
text written in a non-English language, specifically featuring the character "ä"
instances of a specific character or symbol
Negative Logits
ORED                -0.82
Sussex              -0.68
Jericho             -0.67
Mayweather          -0.66
IFIED               -0.66
Bullets             -0.63
Hodg                -0.60
################    -0.59
Asians              -0.58
Notting             -0.58
Positive Logits
ä                   1.27
inen                1.18
¢                   1.10
ternity             1.05
·                   0.96
¯¯¯¯                0.93
¶                   0.93
ë                   0.91
ï                   0.90
ö                   0.88
Activations Density: 0.012%