Explanations
phrases related to sharing updates and commentary in a casual style
statements reflecting self-awareness and personal opinions
Negative Logits
).[          -0.65
."[          -0.60
anwhile      -0.60
therefore    -0.58
]."          -0.58
.'"          -0.55
)."          -0.55
'."          -0.53
".[          -0.52
.""          -0.51
Positive Logits
Spoiler      0.63
spoilers     0.60
Spoiler      0.59
FANTASY      0.57
lovely       0.57
Patreon      0.57
spoiler      0.56
COMPLE       0.53
REALLY       0.53
Reviewer     0.52
Activations Density: 3.015%