Explanations
expressions related to progress or updates
the word "now" and the phrase "also," indicating a focus on current events or updates
Negative Logits
Subject         -0.70
omn             -0.60
Rap             -0.58
Mats            -0.58
avier           -0.56
Behind          -0.55
harm            -0.55
etting          -0.54
onto            -0.53
disobedience    -0.53
Positive Logits
been            1.55
been            1.32
undergone       1.09
gone            0.97
become          0.96
gotten          0.96
Been            0.95
gone            0.93
fallen          0.92
begun           0.90
Activations Density: 0.151%