Explanations
instances of dialogue and quotations
Negative Logits
Token       Logit
.synthetic  -0.18
ertino      -0.16
ект         -0.15
eil         -0.14
dag         -0.14
.sl         -0.14
iets        -0.14
ignon       -0.14
icaret      -0.13
Dar         -0.13
Positive Logits
Token       Logit
illator     0.18
ame         0.15
pdev        0.15
relev       0.14
myself      0.14
avern       0.14
ashes       0.14
imd         0.14
zych        0.14
mash        0.14
Activations Density: 0.227%
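
As a rough illustration of how a density figure like this can be read, the sketch below assumes that activation density means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the feature fires.

    Assumption: a position counts as active when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 227 of which activate the feature.
sample = [1.0] * 227 + [0.0] * (100_000 - 227)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.227% would mean the feature is active on roughly 2 to 3 of every 1,000 tokens.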