INDEX
Explanations
comparative phrases regarding individuals and their roles or behaviors in various contexts
difference or dissimilarity
describing differences
Negative Logits
\{\\             -0.61
-0.60
__':             -0.59
CreateTagHelper  -0.59
uxxxx            -0.58
Insee            -0.56
য়ে               -0.51
mattino          -0.51
թվական           -0.49
श्यक              -0.49
Positive Logits
differs    2.24
differ     1.99
differed   1.90
different  1.76
Different  1.69
Different  1.69
different  1.68
Differ     1.58
DIFFERENT  1.55
diferente  1.53
Activations Density 0.967%