INDEX
Explanations
words associated with the term "Dead", activating with varying intensity
references to "Dead" in various contexts, particularly in titles and themes
Negative Logits
ĸļ        -0.90
anwhile   -0.81
orney     -0.78
ickr      -0.76
efully    -0.73
uration   -0.68
unity     -0.66
REPL      -0.66
FFFF      -0.65
aunder    -0.65
Positive Logits
Dead      1.24
Dead      1.00
Alive     0.91
DEAD      0.88
dead      0.87
spin      0.85
pool      0.85
liest     0.84
Corpse    0.82
bag       0.81
Activations Density 0.009%