Explanations
terms related to illegal activities and legal charges
references to legal charges or criminal activities
Negative Logits
hordes     -0.76
scenes     -0.74
igans      -0.69
sheets     -0.68
ants       -0.68
easy       -0.68
alot       -0.67
amazing    -0.67
flies      -0.66
awfully    -0.66
Positive Logits
lawful         0.90
versive        0.89
licensee       0.88
valid          0.85
violation      0.84
prohibited     0.84
permissible    0.82
bona           0.80
dwelling       0.79
person         0.79
Activations Density 0.266%