INDEX
Explanations
social media platform names, specifically Facebook
references to Facebook
Negative Logits
bilt     -0.74
rals     -0.73
plane    -0.71
ACTED    -0.68
ppa      -0.66
tti      -0.63
akin     -0.63
rans     -0.62
gran     -0.62
stood    -0.62
Positive Logits
[token missing]    1.10
[token missing]    0.92
[token missing]    0.85
Comments           0.85
Tweet              0.82
Tumblr             0.79
[token missing]    0.76
[token missing]    0.76
Likes              0.76
[token missing]    0.73
Activations Density 0.040%