INDEX
Explanations
texts written in a specific foreign language
characters and symbols from various encoded scripts
Negative Logits
espie       -0.94
womb        -0.68
ttes        -0.67
Murdoch     -0.67
treasury    -0.66
Treasury    -0.66
PD          -0.64
enegger     -0.63
Coulter     -0.63
Wink        -0.62
Positive Logits
ï¸ı         1.08
Ô           1.03
âĪ          0.99
ðĿ          0.87
ãĥ©ãĥ³      0.86
Ì           0.85
£           0.85
Æ           0.85
Ëľ          0.84
в           0.84
Activations Density 0.047%