Explanations
terms indicating comparison or correspondence between entities or values
Negative Logits
Token    Logit
wahr     -0.38
יף       -0.37
skład    -0.36
def      -0.36
fies     -0.36
éges     -0.35
ссора    -0.35
/>);     -0.34
Terms    -0.34
kter     -0.34
Positive Logits
Token            Logit
OGND             0.90
IntoConstraints  0.90
ligiloj          0.78
المعيارى         0.78
曖昧さ回避       0.77
                 0.76
estekak          0.75
equivalent       0.74
harusnya         0.73
counterparts     0.72
Activations Density 0.388%