INDEX
Explanations
adjectives that denote suitability or perfection for specific purposes
Negative Logits
Token       Logit
AWN         -0.15
ouri        -0.15
ridor       -0.15
OWER        -0.15
_encoded    -0.14
ẩu          -0.14
imento      -0.14
istar       -0.13
elib        -0.13
inh         -0.13
Positive Logits
Token       Logit
for         0.17
642         0.13
675         0.13
215         0.13
dla         0.13
adle        0.13
ashamed     0.13
.shell      0.13
für         0.13
ler         0.13
Activations Density: 0.078%