INDEX
Explanations
references to social media actions, specifically related to sharing content
instances of the word "Share"
Negative Logits
schild     -0.81
panic      -0.75
destro     -0.74
sbm        -0.73
compr      -0.73
undai      -0.72
morrow     -0.71
©¶æ        -0.68
bley       -0.68
anwhile    -0.68
Positive Logits
holders    1.22
holder     1.16
ership     0.99
ables      0.90
Share      0.87
cro        0.83
able       0.83
Tweet      0.82
holding    0.81
edin       0.79
Activations Density: 0.017%