INDEX
Explanations
references to murder and violent acts
Negative Logits
Token       Logit
éal         -0.18
itech       -0.17
IFY         -0.16
Ships       -0.14
kalp        -0.14
endance     -0.14
大小        -0.14
mnt         -0.14
ships       -0.14
ify         -0.14
Positive Logits
Token       Logit
etry        0.15
erson       0.15
Highlands   0.15
cade        0.14
Gord        0.14
Highlander  0.14
stump       0.14
revenge     0.14
azzi        0.13
ognito      0.13
Activations Density: 0.083%