INDEX
Explanations
phrases encouraging action or pursuit of goals
Negative Logits
trap          -0.16
indow         -0.15
sworth        -0.15
peon          -0.14
Trap          -0.13
enus          -0.13
.party        -0.13
дам           -0.13
isFunction    -0.13
ugh           -0.13
Positive Logits
ixin          0.15
csi           0.14
ÃĸL           0.14
olle          0.13
347           0.13
iva           0.13
Machinery     0.13
Fet           0.13
_utf          0.13
253           0.13
Activations Density: 0.014%