Explanations
Words associated with adjectives and their derived forms, particularly those indicating qualities, conditions, and characteristics.
Negative Logits
Token      Logit
ìĿĦ        -0.21
lo         -0.20
ses        -0.20
Ìĥ         -0.19
ers        -0.19
soever     -0.19
ings       -0.19
ma         -0.19
ra         -0.18
(blank)    -0.18
Positive Logits
Token      Logit
-minded    0.19
y          0.17
ALLY       0.17
amente     0.17
ity        0.16
/select    0.16
yas        0.16
-looking   0.16
ourt       0.15
elyn       0.15
Activations Density: 0.143%
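
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; it assumes "density" means the fraction of sampled positions with a nonzero activation, which the source does not state. The function name `activation_density` and the sample data are hypothetical, chosen so the result matches the reported value.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density is counted per token position over a sample.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 143 of them active.
sample = [1.0] * 143 + [0.0] * (100_000 - 143)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.143% corresponds to the feature being active on roughly one position in 700.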