INDEX
Explanations
expressions or descriptions indicating something is simplistic or lacking in complexity
Negative Logits
brows     -0.15
atre      -0.14
aclass    -0.14
uling     -0.14
blind     -0.14
byss      -0.14
chef      -0.14
Bib       -0.14
mania     -0.14
ën        -0.13
Positive Logits
ings      0.19
ãĢħ       0.17
/plain    0.17
IDER      0.16
é¾        0.16
clo       0.16
миÑĤ      0.16
-ÑĤаки    0.16
chant     0.15
à¹Ĩ       0.15
Activations Density: 0.008%