INDEX
Explanations
dates and specific events or actions
sentences that convey significant conclusions or observations
Negative Logits
覚醒        -0.62
»Ĵ          -0.59
yells       -0.58
Dialogue    -0.57
brawl       -0.57
grit        -0.57
chron       -0.55
ounding     -0.54
pg          -0.53
bodily      -0.53
Positive Logits
However     1.06
But         1.04
but         1.01
However     1.01
But         0.99
Instead     0.93
Then        0.89
Instead     0.87
Now         0.86
until       0.86
Activations Density 0.887%