INDEX
Explanations
references to vampire films and their narratives
Negative Logits
cono        -0.16
shal        -0.16
ë©ĺ         -0.15
ustum       -0.15
uales       -0.14
endregion   -0.14
entar       -0.14
motiv       -0.14
leads       -0.13
æĶ¯         -0.13
POSITIVE LOGITS
es          0.20
differs     0.19
promised    0.19
accompl     0.18
pig         0.18
felt        0.17
diver       0.17
pig         0.17
certainly   0.17
attempted   0.17
Activations Density 0.448%