Explanations
phrases indicating differentiation or distinction between subjects or entities
Negative Logits
Token         Logit
Swanson       -0.58
Haber         -0.52
cas           -0.51
нансо         -0.48
occasionally  -0.47
Ban           -0.45
Hold          -0.44
WithFormat    -0.44
Sto           -0.43
qrt           -0.42
Positive Logits
Token            Logit
distinguishes    1.26
differentiating  1.24
differentiates   1.23
distinguish      1.21
differentiate    1.21
distinguishing   1.19
distingue        1.18
distinguer       1.15
differ           1.13
diferen          1.10
Activations Density: 0.217%
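
The density figure is most plausibly the share of sampled tokens on which the feature fires, although the page does not state its exact definition. The sketch below illustrates that reading only: the function name `activation_density`, the zero threshold, and the sample values are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as "active" on a token when its activation
    is strictly greater than `threshold` (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [0.8, 1.3, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.300%
```

Under this reading, a density of 0.217% would mean the feature is active on roughly 2 out of every 1,000 tokens in whatever sample the dashboard used.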