Explanations
phrases indicating relationships or comparisons involving the concept of distance or extent
Negative Logits
acz      -0.16
ishly    -0.16
apro     -0.15
edl      -0.15
bove     -0.15
uito     -0.14
ven      -0.14
adora    -0.14
vens     -0.14
aland    -0.14
Positive Logits
concerned     0.64
concern       0.46
Concern       0.42
concerns      0.41
Concern       0.39
cern          0.24
而            0.24
concerning    0.24
preocup       0.24
кас           0.22
Activations Density 0.033%