Explanations
phrases and prepositions that indicate relationships or connections between concepts
Negative Logits
Token          Logit
jos            -0.17
amas           -0.15
GURL           -0.15
nackte         -0.14
otime          -0.14
Král           -0.14
ANNEL          -0.13
íĥ             -0.13
FRING          -0.13
usercontent    -0.13
Positive Logits
Token          Logit
various        0.30
both           0.25
several        0.21
Various        0.20
åIJĦç§į         0.20   (byte-level rendering of 各种, "various")
both           0.18
different      0.18
både           0.18
Various        0.18
actual         0.17
Activations Density: 0.042%
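
The density figure can be read as the share of token positions on which the feature fires. The page itself does not define the metric, so the sketch below assumes that "active" means an activation strictly greater than zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its value is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 42 active positions out of 100,000 tokens.
sample = [1.0] * 42 + [0.0] * 99_958
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.042%
```

Under this reading, a density of 0.042% corresponds to roughly 4 active positions per 10,000 tokens, which marks the feature as sparse.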