INDEX
Explanations
language related to subtlety and complexity in communication, emphasizing authenticity and avoiding over-simplification
Negative Logits
GX      -0.16
aqu     -0.15
è«      -0.15
ysl     -0.14
Aqu     -0.14
blas    -0.14
hr      -0.14
etik    -0.14
oca     -0.14
oro     -0.14
Positive Logits
ç´      0.17
outers  0.14
Robot   0.14
avel    0.14
Masc    0.14
teg     0.14
ILON    0.14
Spor    0.14
noun    0.14
ledge   0.14
Activations Density: 0.322%