INDEX
Explanations
sentences ending in a period
occurrences of sentences that express declining to provide comments
Negative Logits
tremend          -1.01
bully            -0.92
imagination      -0.85
scenery          -0.81
unstoppable      -0.81
beaut            -0.81
revol            -0.81
amazing          -0.81
metic            -0.80
gorge            -0.80
Positive Logits
Additionally     1.35
However          1.25
Previously       1.23
<|endoftext|>    1.20
Officials        1.16
According        1.16
Alternatively    1.14
Presumably       1.12
Afterwards       1.12
Notably          1.11
Activations Density 0.742%