INDEX
Explanations
phrases indicating specificity and precision
Negative Logits
ต่างๆ       -0.43
gröss       -0.42
múltiples   -0.42
อื่น         -0.41
otros       -0.41
multiple    -0.41
africaine   -0.39
ansatte     -0.39
other       -0.39
其他        -0.39
Positive Logits
exactly      0.82
EXACTLY      0.79
exactly      0.77
виправивши   0.76
Exactly      0.75
Exactly      0.73
precisely    0.73
Genau        0.68
JUST         0.68
exatamente   0.67
Activations Density: 0.182%