INDEX
Explanations
phrases that express a desire or intention to obtain something
Negative Logits
\<^      -0.17
andel    -0.16
oka      -0.15
uary     -0.15
immel    -0.15
ply      -0.15
atics    -0.15
laden    -0.14
stk      -0.14
ars      -0.14
Positive Logits
ekl          0.15
Morr         0.15
anos         0.14
άζ           0.14
EMA          0.14
entes        0.14
éĸ¢          0.14
Wilkinson    0.14
untime       0.13
ryn          0.13
Activations Density 0.084%