INDEX
Explanations
phrases indicating warnings or disclaimers regarding sensitive content
Negative Logits
ptid          -0.47
(;;)          -0.45
IDENTITY      -0.45
::$_          -0.44
LOV           -0.43
endgroup      -0.42
Scalars       -0.42
aryn          -0.41
graphql       -0.41
[+            -0.41
Positive Logits
chapter       0.86
chapters      0.85
Chapter       0.81
author        0.78
chapitre      0.78
Chapitre      0.78
story         0.77
fanfiction    0.76
fanfic        0.76
Chapter       0.75
Activations Density 0.063%