INDEX
Explanations
verbs indicating a direction or an outcome, particularly in the context of decision-making or progression
Negative Logits
fony      -0.17
EDI       -0.16
uw        -0.15
OnClick   -0.14
imeo      -0.14
aser      -0.14
ney       -0.14
arie      -0.14
use       -0.14
lep       -0.14
Positive Logits
625       0.18
gers      0.16
aina      0.15
hunter    0.14
OSH       0.14
haft      0.14
LAY       0.14
us        0.14
leading   0.14
ç¥        0.14
Activations Density 0.027%