INDEX
Explanations
references to illicit drugs and related criminal activities
Negative Logits
ftagPool         -0.60
ArrowToggle      -0.54
utafitiHapana    -0.52
morire           -0.52
oftware          -0.50
typeorm          -0.48
înd              -0.48
ibin             -0.47
wahati           -0.47
bygger           -0.46
Positive Logits
stolen           0.94
donated          0.92
belonging        0.87
confiscated      0.86
valued           0.80
recovered        0.79
gifted           0.77
belonged         0.76
smuggled         0.74
purchased        0.72
Activations Density 0.452%