INDEX
Explanations
pronouns and verbs referring to oneself
references to self-related actions or events
Negative Logits
heny            -0.80
veyard          -0.74
rought          -0.73
vertisements    -0.71
microsoft       -0.70
grain           -0.70
profits         -0.70
revolutions     -0.68
amazon          -0.68
sweet           -0.67
Positive Logits
personally      0.73
himself         0.73
profess         0.71
underwater      0.71
confessed       0.70
mate            0.69
submar          0.66
unworthy        0.65
remorse         0.65
onstage         0.64
Activations Density: 0.029%