INDEX
Explanations
punctuation and formatting symbols indicating tone or emotional expression
Punctuation and symbols
sentence endings
Negative Logits
Token                 Logit
!")                   -1.02
}))                   -0.92
doubtnut              -0.88
.")                   -0.87
Jefus                 -0.87
ſelf                  -0.86
pleaſure              -0.85
(token not shown)     -0.85
ſelves                -0.83
raiſ                  -0.82
Positive Logits
Token                 Logit
[                     0.74
(token not shown)     0.67
-                     0.65
--                    0.59
↵                     0.58
(                     0.58
:                     0.56
|                     0.56
$                     0.56
it                    0.55
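
Token/logit lists like the ones above are commonly produced by projecting a feature's output direction through the model's unembedding and ranking the vocabulary by the resulting score. The sketch below illustrates that ranking step only. It assumes this dashboard follows that convention, which the page itself does not state; the function names, the two-dimensional vectors, and the four-token vocabulary are hypothetical toy values, not data from this feature.

```python
# Toy sketch: rank vocabulary tokens by a feature's effect on output logits.
# Assumption: effect = dot(feature output direction, token unembedding vector).
# All vectors below are made-up illustrative values.

def logit_effects(direction, unembedding):
    """Return {token: dot(direction, unembedding[token])}."""
    return {
        token: sum(d * u for d, u in zip(direction, vector))
        for token, vector in unembedding.items()
    }

def top_and_bottom(effects, k):
    """Return the k most positive and the k most negative (token, score) pairs."""
    ranked = sorted(effects.items(), key=lambda item: item[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # most negative first
    return positive, negative

# Hypothetical feature direction and unembedding rows.
direction = [1.0, -0.5]
unembedding = {
    "[": [0.8, 0.1],
    ":": [0.5, -0.1],
    "the": [0.0, 0.0],
    '!")': [-1.0, 0.1],
}

effects = logit_effects(direction, unembedding)
positive, negative = top_and_bottom(effects, k=1)
print("most promoted:", positive)
print("most suppressed:", negative)
```

With real model weights the same two steps apply, only over the full vocabulary and with k set to the number of rows the dashboard displays (ten in each list here).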
Activations Density: 0.433%
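
A density figure of this kind is usually read as the fraction of sampled token positions on which the feature is active. The sketch below shows that calculation under the assumption that "active" means an activation strictly above zero; the threshold, the function name, and the sample data are hypothetical and are not taken from this page.

```python
# Minimal sketch: activation density as the share of token positions
# where the feature's activation exceeds a threshold.
# Assumption: "active" means activation > 0; the sample below is invented.

def activation_density(activations, threshold=0.0):
    """Fraction of entries in `activations` strictly greater than `threshold`."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for value in activations if value > threshold)
    return active / len(activations)

# Hypothetical sample: 1000 token positions, 3 of them active.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.300%
```

Under this reading, a density of 0.433% would correspond to the feature firing on roughly 4 out of every 1000 sampled tokens.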