Explanations
mentions of death and murder
Negative Logits
Token    Logit
enko     -0.19
enties   -0.15
iday     -0.15
igan     -0.15
untime   -0.15
sti      -0.14
ihad     -0.14
ason     -0.14
alue     -0.14
hiện     -0.14
Positive Logits
Token      Logit
innocent   0.20
innoc      0.17
innocence  0.17
екти       0.16
icon       0.16
journal    0.15
zet        0.15
Explorer   0.15
members    0.15
unarmed    0.15
Activations Density 0.088%