INDEX
Explanations
phrases that indicate a change in focus or direction
the word "away" in various contexts of shifting focus or direction
Negative Logits
ammy      -0.76
milo      -0.73
turnover  -0.69
ingham    -0.66
Organ     -0.65
Beans     -0.63
iosyn     -0.63
tone      -0.63
chang     -0.62
beans     -0.62
Positive Logits
toward    0.78
towards   0.78
finder    0.70
posts     0.69
RAG       0.68
away      0.67
ente      0.67
from      0.67
awei      0.66
rov       0.66
Activations Density: 0.031%