INDEX
Explanations
references to financial penalties or fines
mentions of financial penalties and fines
NEGATIVE LOGITS
Chest   -0.72
topic   -0.62
Path    -0.60
Path    -0.60
src     -0.59
path    -0.58
Ak      -0.58
View    -0.58
dev     -0.58
woke    -0.57
POSITIVE LOGITS
fines        3.78
fined        1.98
penalties    1.94
punishments  1.54
fine         1.42
bribes       1.42
suspensions  1.31
penalty      1.27
fine         1.23
subpoen      1.19
Activations Density: 0.013%