Explanations
Modal verbs indicating possibility or necessity
Negative Logits
ilor    -0.16
arias   -0.15
urt     -0.15
alian   -0.14
ke      -0.14
ylie    -0.14
iris    -0.13
706     -0.13
alia    -0.13
yst     -0.13
Positive Logits
ville   0.16
bach    0.16
ij      0.15
abolic  0.15
ase     0.15
vay     0.14
갤      0.14
Coeff   0.14
zelf    0.14
Stamp   0.14
Activations Density 0.117%