INDEX
Explanations
specific entities mentioned in news articles or journalistic pieces
references to media, particularly journalism, performances, and publications
Negative Logits
izabeth     -0.70
EVs         -0.69
ascript     -0.66
uture       -0.66
defin       -0.63
fiat        -0.63
urion       -0.62
+++         -0.62
reditary    -0.62
ureau       -0.61
Positive Logits
boycott         0.75
pornographic    0.73
jihad           0.71
copyrighted     0.66
exhib           0.65
nightclub       0.65
cigarettes      0.65
Club            0.64
ridic           0.64
boycot          0.64
Activations Density 0.780%