Explanations
words that represent various script or language characters, particularly in non-Latin scripts
Negative Logits
| Token | Logit |
|---|---|
| ``` | -0.55 |
| injust | -0.54 |
| nutz | -0.53 |
| intre | -0.53 |
| Gabel | -0.53 |
| ում | -0.53 |
| geste | -0.52 |
| lenker | -0.52 |
| for | -0.52 |
| icher | -0.52 |
Positive Logits
| Token | Logit |
|---|---|
| تانيه | 0.86 |
| NewLabel | 0.84 |
| ghijklmnop | 0.83 |
| Мексичка | 0.79 |
| enumii | 0.73 |
| NUMX | 0.72 |
| O | 0.70 |
| holdet | 0.69 |
| DoubleQuotes | 0.68 |
| wapV | 0.67 |
Activations Density 0.132%