INDEX
Explanations
critics or opponents of various plans, policies, or actions
references to individuals or groups who express opposition or criticism
Negative Logits
Pradesh       -0.70
carnage       -0.65
Anniversary   -0.65
owship        -0.63
ingestion     -0.62
sunrise       -0.61
sunset        -0.58
bilateral     -0.58
ibrary        -0.57
UCK           -0.57
Positive Logits
paces         1.19
hip           1.06
oft           0.98
hesis         0.98
thereof       0.97
ervatives     0.93
hips          0.89
pace          0.87
argue         0.85
alike         0.84
Activations Density: 0.107%
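
The density figure can be read as the share of token positions on which this feature fires. The page does not state the exact definition, so the following is only a minimal sketch, assuming density means the fraction of positions whose activation is above zero. The function name activation_density, the threshold parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    feature is active; the source page does not define the metric.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 of 1000 token positions.
sample = [0.0] * 999 + [2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.107% would correspond to the feature being active on roughly one token in every 935.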