INDEX
Explanations
references to year-end summaries and recommendations
Negative Logits
assen      -0.17
idel       -0.16
olie       -0.16
genders    -0.15
uta        -0.15
instant    -0.15
avic       -0.15
olan       -0.14
Hide       -0.14
oka        -0.14
Positive Logits
pio             0.16
æĸ              0.15
thin            0.14
cion            0.14
ning            0.14
nothrow         0.14
Intermediate    0.14
intermediate    0.14
mans            0.14
lue             0.13
Activations Density: 0.046%