INDEX
Explanations
mentions of the social media platform Twitter
mentions of Twitter and its related functionalities
Negative Logits
cision          -0.70
cised           -0.69
Starr           -0.67
Scandinavian    -0.67
olphin          -0.66
Vie             -0.66
minded          -0.64
istries         -0.64
Dahl            -0.63
Samoa           -0.62
Positive Logits
storms           0.90
hashtag          0.89
hasht            0.88
@@@@@@@@         0.82
storm            0.79
feeds            0.78
handles          0.77
TW               0.76
(@               0.75
users            0.74
Activations Density 0.034%
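
A rough sketch of how a density figure like the one above could be computed, assuming it means the fraction of sampled tokens on which the feature fires. That definition, the function name activation_density, the zero threshold, and the sample data are all illustrative assumptions, not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature is active; the dashboard does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 34 of which activate the feature.
sample = [0.0] * 99_966 + [1.2] * 34
density = activation_density(sample)
formatted = f"{density:.3%}"
print(formatted)
```

Under this reading, a density this low suggests a sparse feature that fires on only a small share of tokens, which fits the narrow Twitter-related explanations listed above.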