INDEX
Explanations
comparative adjectives and phrases signaling improvement or decline
Negative Logits
eron          -0.17
shop          -0.16
uzey          -0.16
equivalents   -0.14
rastructure   -0.14
Що            -0.14
/Foundation   -0.14
uite          -0.14
edBy          -0.14
uren          -0.14
Positive Logits
tình          0.15
how           0.14
gni           0.14
ến            0.14
ruz           0.14
aga           0.14
apo           0.13
paras         0.13
Bottom        0.13
tj            0.13
Activations Density: 0.078%