Explanations
occurrences of the word "murder" or mention of specific criminal cases
Negative Logits
featureID        -1.06
RenderAtEndOf    -0.96
ьаж              -0.92
省市镇           -0.90
ویکیپدی          -0.89
bootstrapcdn     -0.88
PreferredItem    -0.87
betweenstory     -0.85
kháu             -0.84
enterOuterAlt    -0.83
Positive Logits
<tr>             1.13
<td>             0.59
الحره            0.58
<th>             0.54
го               0.51
c                0.49
<blockquote>     0.48
x                0.48
c                0.48
相               0.47
Activations Density: 0.019%