INDEX
Explanations
references to variations or differences among entities or concepts
Negative Logits
another      -0.08
another      -0.08
دیگری        -0.07
omething     -0.07
something    -0.07
ruit         -0.07
Another      -0.07
bersome      -0.07
bove         -0.06
something    -0.06
Positive Logits
different    0.22
不同的       0.18
different    0.17
diferentes   0.16
varying      0.15
Different    0.15
不同         0.15
Different    0.15
разных       0.14
مختلف        0.14
Activations Density 0.046%