INDEX
Explanations
statements related to politics and legal issues
statements or phrases related to accusations and legal actions
Negative Logits
~/        -0.76
lease     -0.71
partName  -0.66
entary    -0.66
Pinball   -0.66
Monthly   -0.66
urry      -0.65
ilde      -0.64
ktop      -0.62
enhagen   -0.62
Positive Logits
".[         0.84
terrorists  0.83
rapists     0.79
unfairly    0.78
terrorism   0.78
"           0.77
extremism   0.76
criminals   0.76
worse       0.75
"â̦          0.75
Activations Density: 1.568%