INDEX
Explanations
instances of dialogue or quotations
Negative Logits
dür           -0.16
igne          -0.15
orrent        -0.15
άκ            -0.15
elson         -0.14
sed           -0.14
uring         -0.14
ronics        -0.13
commercial    -0.13
work          -0.13
Positive Logits
lya           0.16
emez          0.15
iller         0.15
Katz          0.15
lush          0.15
_tc           0.14
Gram          0.14
翼            0.14
/widgets      0.14
linger        0.14
Activations Density 0.122%
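The page does not define the density figure. It is presumably the fraction of sampled token positions on which this feature has a nonzero activation. A minimal sketch under that assumption follows; the helper name `activation_density`, the threshold of zero, and the sample data are all hypothetical and not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. The source page does not state its definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 100,000 token positions, 122 of them active.
sample = [0.0] * 99_878 + [1.5] * 122
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.122%"
```

Under this reading, a density of 0.122% would mean the feature fires on roughly one token in every 820.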