Explanations
references to penalties or fines
Negative Logits
erial         -0.81
rina          -0.78
atters        -0.77
ergy          -0.74
ullivan       -0.73
bits          -0.73
ãĤ¤ãĥĪ        -0.72
growth        -0.72
oscope        -0.72
Remastered    -0.72
Positive Logits
penalties      1.20
levied         1.13
imposed        1.10
penalty        1.08
sanction       0.99
incurred       0.95
inflicted      0.92
punishments    0.88
fines          0.87
punishment     0.87
Activations Density: 0.010%