Explanations
instances where a character is talking or thoughts are revealed
end punctuation marks in sentences
Negative Logits
tremend          -0.79
favors           -0.66
subsidized       -0.66
transact         -0.66
modernization    -0.64
abund            -0.63
prioritize       -0.61
centralized      -0.61
biases           -0.60
incentives       -0.60
Positive Logits
↵                1.12
His              1.00
He               1.00
Pict             0.90
<|endoftext|>    0.88
↵↵               0.88
His              0.82
Afterwards       0.82
Picture          0.81
He               0.78
Activations Density 0.288%