INDEX
Explanations
words related to improvement, progress, or motivation
special characters, particularly the character "Â" which appears prominently in the text
Negative Logits
Token      Logit
enburg     -0.78
enegger    -0.77
ãģ®éŃĶ     -0.68
Athena     -0.65
iary       -0.65
iant       -0.62
Atom       -0.62
iard       -0.61
otrop      -0.60
Druid      -0.60
Positive Logits
Token      Logit
Â          1.38
¹          1.30
¬          1.15
¼          1.15
¥          1.14
¤          1.11
ª          1.09
¿          1.09
¸          1.04
³          1.04
Activations Density 0.010%