INDEX
Explanations
information related to incidents, accidents, and legal situations involving individuals
Negative Logits
ivals         -0.85
phabet        -0.80
comparisons   -0.75
doms          -0.72
rities        -0.71
scenarios     -0.69
narratives    -0.69
ividual       -0.69
venge         -0.68
lifestyles    -0.68
Positive Logits
malfunction   1.38
jammed        1.11
worn          1.11
cracked       1.04
equipped      1.01
broken        1.00
rigged        1.00
confiscated   0.99
activated     0.99
repaired      0.98
Activations Density: 0.342%