Explanations
sentences related to personal opinions or reflections
Negative Logits
Token         Logit
veil          -0.67
xtap          -0.65
WATCHED       -0.65
sugg          -0.64
plea          -0.63
rush          -0.63
drop          -0.63
estone        -0.63
tip           -0.61
Previously    -0.60
Positive Logits
Token         Logit
theirs        1.30
hers          1.18
yours         1.16
ours          1.12
anybody       0.88
mine          0.87
everybody     0.82
others        0.81
everyone      0.81
whoever       0.80
Activations Density: 0.286%