Explanations
action-related phrases with a clear purpose
phrases expressing intentions, aims, hopes, or goals
Negative Logits
avorite      -0.86
nodd         -0.73
elta         -0.70
imore        -0.64
ocket        -0.64
aukee        -0.59
zek          -0.59
batted       -0.58
heed         -0.57
oggles       -0.57
Positive Logits
UTH           0.74
guyen         0.69
à¨            0.66
ĺħ            0.66
onym          0.66
caveat        0.63
ims           0.62
implication   0.60
either        0.59
pretext       0.59
Activations Density 0.097%