INDEX
Explanations
instances of emotional expressions and personal experiences shared on social media
Negative Logits
annon      -0.19
mans       -0.16
vern       -0.15
nan        -0.15
iversit    -0.14
åĬª        -0.14
peare      -0.13
ÐĵÐŀ       -0.13
Ñīе        -0.13
entic      -0.13
Positive Logits
shown      0.23
seen       0.21
видно      0.19
audio      0.19
can        0.19
clearly    0.19
visible    0.18
clip       0.18
footage    0.18
appears    0.17
Activations Density 0.061%