Explanations
Twitter handles
Twitter handles or mentions
Negative Logits
Bradford          -0.74
specificity       -0.72
Norwich           -0.69
Pyramid           -0.69
Leone             -0.66
ordinance         -0.65
choir             -0.65
lipstick          -0.65
Romanian          -0.65
tracts            -0.64
Positive Logits
realDonaldTrump   1.07
#$                0.94
@@@@@@@@          0.94
nat               0.89
hidden            0.87
username          0.86
groups            0.85
ANI               0.83
thereal           0.83
dot               0.82
Activations Density: 0.024%