Explanations
references to soap
references to soap operas
Negative Logits
Token        Logit
umbered      -0.73
etermined    -0.71
trl          -0.66
uding        -0.64
omission     -0.63
alez         -0.63
orders       -0.62
sworn        -0.60
CVE          -0.60
places       -0.60
Positive Logits
Token        Logit
opera        1.14
oper         1.09
soap         0.96
eless        0.87
wich         0.84
bowl         0.81
pering       0.78
herer        0.78
utical       0.75
Leone        0.75
Activations Density: 0.009%