INDEX
Explanations
personal pronouns followed by verbs or observations, particularly related to social media and politics
repeated pronouns and expressions of knowledge or belief
Negative Logits
disappearing    -0.65
Restoration     -0.64
Effective       -0.63
Blessing        -0.61
Reborn          -0.60
Plum            -0.58
forms           -0.58
Delicious       -0.57
601             -0.56
Refuge          -0.56
Positive Logits
wonder          1.02
wish            0.96
hear            0.96
sympath         0.95
understand      0.95
know            0.95
imag            0.92
feel            0.91
empath          0.90
realise         0.90
Activations Density 0.415%
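
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of sampled token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name, the threshold parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. The source page does not define the term.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1,000 token positions, 4 of which activate the feature.
sample = [0.0] * 996 + [0.8, 1.3, 0.2, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.400%
```

Under this reading, a density of 0.415% would mean the feature fires on roughly 4 of every 1,000 sampled tokens.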