INDEX
Explanations
references to theft and related criminal activities
Negative Logits
oluble        -0.57
criterio      -0.50
partimento    -0.49
Medien        -0.47
onError       -0.47
MyApp         -0.47
iyaki         -0.46
renou         -0.46
IVersion      -0.45
mostrarse     -0.45
Positive Logits
stealing      1.14
theft         1.13
steal         1.12
steals        1.11
stolen        1.02
theft         1.02
steal         1.02
stole         1.01
Theft         0.99
thief         0.99
Activations Density: 0.535%
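The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 5 with a non-zero activation.
toy = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.500%
```

Under that assumption, a density of 0.535% would mean the feature is active on roughly 1 in every 187 sampled tokens.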