Explanations
phrases related to directions or instructions
common pronouns and their associated verbs in context
Negative Logits
rush       -0.65
inav       -0.62
ensu       -0.60
"}],"      -0.57
\":        -0.56
Preview    -0.56
Danish     -0.56
Lobby      -0.55
Scand      -0.55
Adv        -0.55
Positive Logits
mere       0.93
merely     0.92
ifle       0.80
Instead    0.79
ngth       0.77
simply     0.75
Berry      0.75
Nor        0.74
erville    0.71
gemony     0.71
Activations Density 0.530%