INDEX
Explanations
expressions of desire or intent related to actions and goals
Negative Logits
Token         Logit
akis          -0.19
Uns           -0.16
_HI           -0.16
cere          -0.14
uns           -0.14
ëŀij           -0.14
æĴ¤           -0.14
Sphere        -0.14
odos          -0.14
hen           -0.13
Positive Logits
Token         Logit
asje          0.16
zept          0.16
stasy         0.15
StateManager  0.15
ÙĦÙĬÙĩ        0.14
olk           0.14
_navigation   0.14
Composite     0.14
660           0.14
Composite     0.14
Activations Density: 0.005%
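
The density figure can be read as the share of token positions on which the feature fires. The page does not say how it was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical, chosen for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature is active. The source does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 20,000 token positions.
sample = [0.0] * 19_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.005%
```

At a density this low the feature is active on roughly one token in twenty thousand, so any sample of top activations is drawn from a very small set of positions.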