Explanations
actions and processes related to taking initiative or making decisions
Negative Logits
lis        -0.17
cant       -0.16
Amen       -0.15
alem       -0.15
arg        -0.15
pend       -0.14
rosse      -0.14
tre        -0.14
mel        -0.14
roi        -0.14
Positive Logits
anut       0.15
uctor      0.14
wald       0.14
itational  0.14
oley       0.14
.xy        0.14
/goto      0.14
ethod      0.14
xy         0.13
nees       0.13
Activations Density: 0.288%