INDEX
Explanations
proper nouns and specific entities
Uncommon capitalization/punctuation
HoJos across
Negative Logits
↵  -0.46
↵↵  -0.38
(token not shown)  -0.33
(token not shown)  -0.31
s  -0.30
_  -0.27
ins  -0.27
#  -0.26
Ins  -0.26
ần  -0.26
Positive Logits
pleaſure  0.88
itſelf  0.84
queſta  0.82
myſelf  0.78
ロウィン  0.78
Geiſt  0.77
queſto  0.77
Monfieur  0.76
fashiola  0.76
faſt  0.75
Activations Density 1.421%