INDEX
Explanations
references to writers and directors in the context of their works
Negative Logits
Cinder          -0.16
оре             -0.16
Gaw             -0.15
егор            -0.15
истра           -0.14
qml             -0.14
Beard           -0.14
TestCategory    -0.14
sst             -0.14
hop             -0.14
Positive Logits
Buffy           0.36
BUFF            0.29
BUFF            0.27
Giles           0.24
_BUFF           0.23
Buff            0.23
buff            0.22
Slayer          0.22
buff            0.22
Vampire         0.19
Activations Density 0.014%