INDEX
Explanations
mentions of death or related terms
mentions of the word "death."
Negative Logits
Available    -0.79
Cola         -0.74
ĸļ           -0.74
Ĭ±           -0.74
Alpha        -0.71
Avg          -0.70
Available    -0.68
OPER         -0.67
ij           -0.67
Collider     -0.67
Positive Logits
blow         0.89
stroke       0.87
death        0.86
touch        0.86
anguage      0.86
guards       0.84
death        0.83
hound        0.83
match        0.82
fighter      0.81
Activations Density: 0.021%