INDEX
Explanations
references to political figures or activities
proper nouns, especially names and titles of individuals
Negative Logits
lihood        -0.83
CONCLUS       -0.76
/>            -0.72
RESULTS       -0.72
thumbnails    -0.70
vg            -0.69
Shutterstock  -0.69
MU            -0.67
Leilan        -0.66
ilial         -0.66
Positive Logits
reetings    0.73
baseman     0.70
whisky      0.65
hester      0.62
frontman    0.60
achu        0.60
dodged      0.60
embattled   0.60
footballer  0.58
awoke       0.58
Activations Density: 0.378%