INDEX
Explanations
instances of violent incidents or criminal activities involving physical assault
Negative Logits
ウス                 -0.86
trade              -0.78
profits            -0.78
inventoryQuantity  -0.75
manac              -0.74
Pand               -0.74
@#&                -0.73
politics           -0.71
Write              -0.69
correctness        -0.69
Positive Logits
kneeling           1.16
grinning           1.06
smiling            1.03
filming            0.99
silhou             0.98
silhouette         0.97
footage            0.96
filmed             0.95
unidentified       0.95
crou               0.94
Activations Density 0.426%