INDEX
Explanations
Twitter usernames preceded by an "@" symbol
Mentions of social media handles or usernames
Negative Logits
orche              -0.71
agra               -0.70
rust               -0.69
elect              -0.69
hust               -0.68
immersion          -0.67
methamphetamine    -0.67
royalty            -0.63
bestos             -0.62
Malk               -0.62
Positive Logits
(@                 1.20
#$                 0.99
username           0.78
SourceFile         0.77
realDonaldTrump    0.77
sorry              0.75
deck               0.75
updated            0.75
NB                 0.74
hello              0.74
Activations Density 0.019%
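
As a rough illustration of how a density figure like the one above might be read, the sketch below assumes that density means the fraction of tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the toy data are all assumptions made for illustration; none of them come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition, for illustration only.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 tokens, 2 of which activate the feature.
toy = [0.0] * 9998 + [3.2, 1.7]
density = activation_density(toy)

# Format as a percentage with three decimal places.
print(f"{density:.3%}")
```

Under this reading, a density of 0.019% would mean the feature fires on roughly 19 out of every 100,000 tokens, which fits a feature that responds to something as specific as "@"-prefixed usernames.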