Explanations
objects or items being carried by individuals
instances of potential violations or illegal activities
Negative Logits
Token      Logit
xual       -0.80
osaurus    -0.77
ouble      -0.76
inois      -0.73
uto        -0.71
ername     -0.70
oreal      -0.69
XIII       -0.66
dit        -0.64
folio      -0.64
Positive Logits
Token      Logit
è¦ļéĨĴ     0.74
Jury       0.66
wash       0.61
SOURCE     0.61
³³³        0.60
accompan   0.60
·          0.59
³³         0.59
Said       0.56
jer        0.56
Activations Density: 0.154%