INDEX
Explanations
mentions of celebrities
mentions and discussions of celebrities
Negative Logits
choes         -0.87
¼             -0.84
anus          -0.83
hematic       -0.81
THER          -0.79
¾             -0.76
¸             -0.75
²¾            -0.73
Ģ             -0.73
tered         -0.71
Positive Logits
rities        1.11
endorsements  1.06
endors        1.05
gossip        1.00
chef          0.96
chefs         0.87
nude          0.78
wcs           0.77
feud          0.77
idols         0.77
Activations Density 0.048%