Explanations
references to illegal activities or objects
instances of illegal activities or practices
Negative Logits
oleon     -0.80
nil       -0.77
ĸļ        -0.75
addons    -0.75
ieu       -0.74
antics    -0.73
roth      -0.73
vation    -0.72
alg       -0.70
erer      -0.70
Positive Logits
detained      0.85
downloaded    0.82
downloading   0.81
illegally     0.81
copied        0.78
infringing    0.78
intercepted   0.77
planted       0.75
parked        0.74
infring       0.72
Activations Density 0.013%
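
The dashboard does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions at which the feature's activation is above zero: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to illustrate what a value of 0.013% would mean under that assumption.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    The dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 13 of which fire.
example = [0.0] * 99_987 + [1.5] * 13
density = activation_density(example)
print(f"{density:.3%}")  # prints 0.013%
```

Under this reading, a density of 0.013% corresponds to the feature firing on roughly 13 of every 100,000 tokens, which would be consistent with a sparse feature tied to a narrow topic.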