Explanations
references to fictional narratives and dramas involving characters and their relationships
Negative Logits
EINA      -0.16
steder    -0.15
baugh     -0.15
cem       -0.14
otomy     -0.14
uhn       -0.14
zel       -0.14
mamak     -0.14
guarda    -0.14
buster    -0.13
Positive Logits
736       0.17
Patch     0.14
735       0.14
ackers    0.14
inka      0.14
ewan      0.14
insky     0.14
733       0.14
Fiction   0.14
533       0.14
Activations Density 0.102%