INDEX
Explanations
social media interaction phrases and symbols related to sharing content
Head Attr Weights
0:0.02
1:0.01
2:0.07
3:0.05
4:0.08
5:0.04
6:0.36
7:0.10
8:0.03
9:0.05
10:0.10
11:0.06
Negative Logits
wagen    -2.19
��       -1.58
tain     -1.48
��       -1.47
�        -1.47
paio     -1.40
�        -1.38
��       -1.36
��       -1.32
payday   -1.24
Positive Logits
together      1.58
ickets        1.56
coins         1.47
earances      1.45
inctions      1.45
sexes         1.36
</            1.33
styles        1.33
Shared        1.31
initialized   1.31
Activations Density 0.008%