INDEX
Explanations
references to violence and captivity in a historical or narrative context
Negative Logits
pun       -0.16
uppe      -0.15
zem       -0.15
ắm        -0.14
erg       -0.14
spat      -0.14
onis      -0.14
itom      -0.14
Olympus   -0.14
trip      -0.14
Positive Logits
hawks     0.16
apult     0.15
resil     0.15
iska      0.14
libertin  0.14
elder     0.13
ulence    0.13
ldr       0.13
Toolbox   0.13
trụ       0.13
Activations Density 0.091%