Explanations
elements related to positive book reviews and character development
Negative Logits
ÅĻi         -0.15
Fuk         -0.15
\:          -0.15
soru        -0.14
idth        -0.14
ä           -0.14
Official    -0.14
umont       -0.14
ensus       -0.14
Mans        -0.14
Positive Logits
åħ          0.15
lev         0.14
eview       0.14
ÏģιÏĥ       0.13
oose        0.13
ector       0.13
.shiro      0.13
зн          0.13
Lit         0.13
wner        0.13
Activations Density: 0.076%