INDEX
Explanations
proper nouns, likely related to news headlines or public figures
references to public figures and incidents involving controversy or conflict
Negative Logits
probabilities   -0.78
Completed       -0.77
"],"            -0.76
chance          -0.73
feasibility     -0.69
aceutical       -0.69
pletion         -0.69
Prediction      -0.69
Farming         -0.69
completion      -0.68
Positive Logits
insulted        1.55
offended        1.47
lashed          1.34
clashed         1.33
harassed        1.31
humiliated      1.28
hurled          1.25
criticised      1.24
angered         1.24
insults         1.23
Activations Density: 0.681%