INDEX
Explanations
phrases where someone is speaking or expressing their opinion
phrases or sentences including dialogue and responses
Negative Logits
Token                             Logit
vable                             -0.77
ords                              -0.76
rica                              -0.74
rawdownloadcloneembedreportprint  -0.70
iffe                              -0.69
Flavoring                         -0.67
tnc                               -0.67
APTER                             -0.65
imore                             -0.64
20439                             -0.64
Positive Logits
Token  Logit
hey    1.32
'      1.27
'[     1.26
`      1.25
"'     1.23
oh     1.11
"      1.08
wow    1.08
Hey    1.06
'(     1.05
Activations Density: 0.082%