INDEX
Explanations
special characters and punctuation marks
Negative Logits
PWN           -0.96
myſelf        -0.91
كومونز        -0.91
laun          -0.84
)+"           -0.82
Brenn         -0.81
fevere        -0.81
دانشنامهٔ     -0.80
Efq           -0.78
Jamestown     -0.78
Positive Logits
いる          1.01
Humphries     0.97
:             0.94
Rosenthal     0.89
iwa           0.88
::::::::      0.87
Ayres         0.86
sertation     0.81
허            0.79
;             0.79
Activations Density 0.157%