INDEX
Explanations
prominent references to specific literary works
Negative Logits
"
-0.26
("
-0.20
"
-0.17
("
-0.17
_
-0.17
`
-0.16
"(
-0.16
ÏģÏį
-0.16
"[
-0.16
,"
-0.16
Positive Logits
Bog
0.17
Leone
0.17
promo
0.16
Noir
0.15
è·
0.14
today
0.14
aka
0.14
Tonight
0.14
ca
0.14
IGNED
0.14
Activations Density 0.000%