Explanations
sections or labels typically found in formal documents or academic publications
Negative Logits
Token     Logit
chw       -0.18
preci     -0.15
Hra       -0.14
elpers    -0.14
lement    -0.14
joint     -0.13
جام       -0.13
157       -0.13
civil     -0.13
reh       -0.13
Positive Logits
Token     Logit
си        0.18
syn       0.18
.syn      0.17
eci       0.17
Syn       0.17
-tax      0.16
tax       0.16
広        0.16
ntax      0.16
synonym   0.15
Activations Density 0.011%
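
As a rough illustration of how a density figure like the one above could arise, the sketch below assumes that "activation density" means the percentage of sampled tokens on which the feature fires (activation above a threshold of zero). That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; none of them come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (tokens with activation > threshold) / (all tokens).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 11 of 100,000 tokens.
sample = [0.0] * 99_989 + [1.5] * 11
print(f"{activation_density(sample):.3f}%")  # prints 0.011%
```

Under this reading, a density of 0.011% would correspond to a very sparse feature that fires on roughly one token in nine thousand.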