INDEX
Explanations
injuries
incidents involving criminal activities or arrests.
The neuron activates on words that report bodily harm or accident outcomes—especially terms like “injuries,” “injured,” and “casualties.”
Negative Logits
.identity     -0.07
ilenames      -0.07
Chỉ           -0.06
(Throwable    -0.06
、それ         -0.06
・━・━         -0.06
知            -0.06
customerId    -0.06
ปฏ            -0.06
Peaks         -0.06
Positive Logits
taxes            0.07
::_              0.07
Zero             0.07
yms              0.06
intern           0.06
elow             0.06
allowance        0.06
Mag              0.06
reduction        0.06
methodologies    0.06
Activations Density: 0.041%