INDEX
Explanations
punctuation and indicators of dialogue
Negative Logits
Token     Logit
penetr    -0.15
tube      -0.15
kop       -0.14
kre       -0.14
cher      -0.14
ube       -0.14
rels      -0.14
tubes     -0.14
風        -0.14
lif       -0.13
Positive Logits
Token     Logit
è¡        0.17
Walters   0.16
740       0.15
/fast     0.15
conc      0.15
662       0.14
741       0.14
345       0.14
409       0.14
ipse      0.14
Activations Density: 0.013%