INDEX
Explanations
measures or values related to a numerical scale
NEGATIVE LOGITS
vw            -0.15
áÄį           -0.15
Alman         -0.14
åĬª           -0.14
orian         -0.14
reo           -0.14
Rum           -0.13
874           -0.13
FOREIGN       -0.13
lemen         -0.13
POSITIVE LOGITS
ronic         0.15
ataka         0.15
blr           0.14
ailability    0.14
soever        0.14
arih          0.14
Unsafe        0.14
imest         0.14
sink          0.14
ensis         0.14
Activations Density 0.009%