INDEX
Explanations
Twitter handles of specific individuals
names of individuals and entities related to social media accounts
Negative Logits
etheless      -0.85
Ͻ             -0.72
bably         -0.68
gged          -0.63
itutes        -0.63
bidden        -0.62
grop          -0.61
dstg          -0.61
irreversible  -0.61
ledged        -0.60
Positive Logits
!:        0.79
!'        0.76
(@        0.74
homepage  0.72
!         0.70
pedia     0.70
»         0.68
Quotes    0.63
HERE      0.63
FAQ       0.62
Activations Density: 0.694%