INDEX
Explanations
symbols indicating questions and emotional expressions
questions and inquiries throughout the text
Negative Logits
eleph        -0.80
anwhile      -0.74
curves       -0.67
proport      -0.67
phased       -0.64
targets      -0.64
ngth         -0.64
sterdam      -0.63
aditional    -0.62
looph        -0.61
Positive Logits
Answer        1.24
Absolutely    1.00
Well          0.96
Probably      0.90
³³³³          0.88
RH            0.86
Honestly      0.85
Yes           0.85
↵             0.84
Yep           0.82
Activations Density: 0.110%