Explanations
references to situations where someone's life is in danger or threatened
references to life-threatening situations or the concept of life
Negative Logits
Celest                -0.77
ERC                   -0.72
Compliance            -0.71
Hyper                 -0.69
channelAvailability   -0.68
xual                  -0.66
CES                   -0.64
propri                -0.64
lav                   -0.64
龍喚士                -0.64
Positive Logits
guards                1.14
boats                 1.10
boat                  1.02
killers               0.96
mares                 0.95
ously                 0.89
guard                 0.89
lihood                0.88
ffen                  0.87
slain                 0.85
Activations Density 0.053%