INDEX
Explanations
words that express positive sentiment or are related to embellishment/decoration
Negative Logits
        -0.95
<eos>   -0.89
on      -0.87
,       -0.84
a       -0.84
in      -0.83
M       -0.83
I       -0.83
P       -0.82
K       -0.79
Positive Logits
myſelf     1.91
itſelf     1.87
Monfieur   1.84
Theſe      1.77
pleaſure   1.75
Anſ        1.74
Majefty    1.74
Jefus      1.74
purpoſe    1.73
་་         1.66
Activations Density 1.806%