INDEX
Explanations
references to scandals or controversial events
references to scandals
Negative Logits
requisite   -0.74
icrobial    -0.72
yip         -0.72
ignt        -0.68
hetics      -0.68
ignty       -0.63
cki         -0.63
emonic      -0.62
erd         -0.61
berman      -0.60
Positive Logits
scandals      1.00
revolving     0.95
scandal       0.93
ously         0.92
involving     0.91
icity         0.84
plag          0.83
ous           0.80
allegations   0.79
engulf        0.76
Activations Density: 0.027%