Explanations
mentions of specific individuals or groups of people
references to people involved in crime or incidents
Negative Logits
raltar        -0.66
aeda          -0.64
ä¸Ĭ           -0.64
actionDate    -0.64
Future        -0.61
united        -0.61
equality      -0.60
Lens          -0.59
moss          -0.59
blight        -0.59
Positive Logits
testified     0.88
complained    0.84
fled          0.83
's            0.83
pleaded       0.83
withdrew      0.81
thanked       0.80
apologized    0.80
reportedly    0.79
hari          0.79
Activations Density: 0.223%