INDEX
Explanations
questions or prompts directed at someone and often involving decision-making
the use of the word "you" in various contexts
Negative Logits
ruciating   -0.77
icent       -0.71
ges         -0.69
Pg          -0.66
Canaver     -0.65
ipal        -0.64
Roose       -0.62
photos      -0.62
Kat         -0.62
peed        -0.62
Positive Logits
guys        1.24
're         1.12
tub         1.03
want        0.94
've         0.94
know        0.93
wanna       0.91
think       0.90
intend      0.90
sir         0.87
Activations Density: 0.080%