Explanations
quotes or direct speech in the form of a conversation
quotation marks used in dialogue
Negative Logits
pole            -0.85
adjud           -0.78
valued          -0.76
characterized   -0.75
accident        -0.74
seasoned        -0.73
targeted        -0.73
foreseeable     -0.73
innov           -0.73
previously      -0.72
Positive Logits
Yeah            1.81
Oh              1.70
Huh             1.70
Hmm             1.67
Uh              1.67
Yes             1.65
Alright         1.64
Okay            1.64
Hey             1.63
Fuck            1.57
Activations Density: 0.101%