Explanations
references to victims in various contexts, particularly related to crime and abuse
Negative Logits
aber      -0.16
ald       -0.16
vid       -0.15
agg       -0.15
orus      -0.14
ider      -0.14
alm       -0.14
eg        -0.14
mates     -0.14
/video    -0.14
Positive Logits
peon      0.18
hood      0.16
нт        0.15
versa     0.15
ivors     0.15
OLUME     0.15
nave      0.15
/mock     0.15
elist     0.15
dụng      0.14
Activations Density 0.019%