Explanations
references to physical harm or damage
harm or damage
Negative Logits
Token           Logit
➊               -0.50
BeforeClass     -0.49
îns             -0.48
libremente      -0.48
SpringRunner    -0.47
}{*}{           -0.47
nikom           -0.46
rolid           -0.46
thoven          -0.45
Vezi            -0.44
Positive Logits
Token           Logit
Damage          1.29
damage          1.20
Damage          1.20
DAMAGE          1.16
damage          1.15
Damaged         1.02
DAMAGE          1.00
damaged         0.98
damaged         0.96
Damages         0.95
Activations Density: 0.018%