INDEX
Explanations
phrases where someone is speaking or quoting someone else
quotations or dialogue in the text
Negative Logits
princip      -0.87
deterrent    -0.85
adv          -0.79
frontline    -0.75
stem         -0.75
hardened     -0.74
conve        -0.73
fluct        -0.73
vigil        -0.73
proceeds     -0.72
Positive Logits
Hey      1.74
Oh       1.62
hey      1.52
Yeah     1.51
Look     1.47
Fuck     1.44
Okay     1.43
Well     1.42
Damn     1.40
yeah     1.38
Activations Density 0.059%