INDEX
Explanations
instances where the word "you" is used with strong emphasis or importance
references to the second person in rhetorical questions
Negative Logits
Token     Logit
inki      -0.66
furt      -0.64
hurst     -0.63
ifacts    -0.62
Net       -0.61
icio      -0.61
NOT       -0.60
Vers      -0.59
ington    -0.58
gypt      -0.57
Positive Logits
Token         Logit
anymore       0.82
kered         0.77
afe           0.68
dfx           0.65
ashamed       0.65
whatsoever    0.65
pity          0.63
appe          0.62
dare          0.62
?!            0.60
Activations Density: 0.079%