INDEX
Explanations
mentions of companies and brand names
references to specific individuals or organizations in the context of activism or discrimination
Negative Logits
earchers    -0.80
©¶æ         -0.80
blance      -0.75
dime        -0.72
ĸļ          -0.71
ources      -0.68
enthal      -0.67
atives      -0.66
ommel       -0.66
hend        -0.64
Positive Logits
iar         0.72
ãģĤ         0.63
Cola        0.63
ा           0.62
iasco       0.61
knees       0.60
Playoff     0.59
à¤          0.58
bananas     0.57
ãĤ¬         0.57
Activations Density: 0.364%