INDEX
Explanations
phrases related to movement and location
NEGATIVE LOGITS
Valid      -0.66
pse        -0.63
Success    -0.63
Failure    -0.61
cracked    -0.60
ottest     -0.59
Breach     -0.58
opus       -0.58
ieu        -0.57
kov        -0.57
POSITIVE LOGITS
cursor          1.05
closer          1.04
needle          0.93
focus           0.89
forward         0.89
toward          0.87
hither          0.83
orts            0.83
izont           0.81
horizontally    0.80
Activations Density 0.159%