INDEX
Explanations
periods and punctuation marks in the text
Negative Logits
bersome  -0.17
agli     -0.14
oya      -0.14
wnd      -0.14
ocha     -0.14
sund     -0.14
stab     -0.14
meal     -0.14
ombo     -0.14
afs      -0.13
Positive Logits
acon        0.15
دÙĩÙħ        0.15
証          0.15
Bust        0.14
kara        0.14
createView  0.14
SPELL       0.14
Ïģια        0.14
ajas        0.14
ettel       0.14
Activations Density 0.044%