Explanations
verbs related to controlling or managing processes or organizations
actions related to control and leadership
Negative Logits
Lenin           -0.70
repre           -0.61
alam            -0.60
henko           -0.60
Bron            -0.60
RELEASE         -0.60
Ë               -0.59
grave           -0.58
shapeshifter    -0.58
Render          -0.57
Positive Logits
escape          1.00
af              0.98
aways           0.95
rampant         0.91
nin             0.82
ways            0.80
gs              0.79
simulations     0.77
smoothly        0.76
nell            0.75
Activations Density: 0.034%