INDEX
Explanations
phrases and questions that express degrees of awareness or understanding about a situation or subject
Negative Logits
pt        -0.15
plus      -0.14
@a        -0.14
inning    -0.14
Eins      -0.14
jon       -0.13
will      -0.13
qv        -0.13
tember    -0.13
pte       -0.13
Positive Logits
ãģĭãĤı    0.25
akin      0.18
itzer     0.16
edn       0.16
ļ         0.16
--)↵      0.15
Ëĺ        0.15
Falsy     0.15
miêu      0.15
emand     0.15
Activations Density 0.046%