INDEX
Explanations
phrases related to personal thoughts or opinions
references to the second person pronoun "you."
Negative Logits
Token           Logit
ensable         -0.65
ween            -0.62
ipal            -0.61
icy             -0.60
ector           -0.60
states          -0.60
Strikes         -0.58
;;;;;;;;;;;;    -0.57
Champ           -0.57
ilts            -0.56
Positive Logits
Token           Logit
're             1.33
tub             1.16
've             1.14
'll             1.07
guys            1.06
'd              0.90
guessed         0.88
hear            0.88
kai             0.86
yourselves      0.86
Activations Density: 0.149%
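
A density figure like the one above is usually read as the fraction of token positions on which the feature fires at all. The sketch below shows that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical, and the dashboard's actual computation is not given in the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (here: above-threshold) activation. The source does not define it.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 149 of 100,000 token positions.
sample = [1.0] * 149 + [0.0] * (100_000 - 149)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.149%
```

Under this reading, a density of 0.149% means the feature is sparse: it is active on roughly 1 token in 670.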