INDEX
Explanations
comparisons and degrees of difference
NEGATIVE LOGITS
diferentes    0.50
irgendwie     0.48
ichever       0.48
Lots          0.46
得很          0.46
Increasing    0.45
different     0.44
ほとんど      0.44
somehow       0.44
różnych       0.43
POSITIVE LOGITS
quite           0.85
Quite           0.77
quite           0.75
Quite           0.73
comparable      0.70
compar          0.66
anywhere        0.64
igual           0.63
equivalently    0.61
столь           0.57
Activations Density 0.027%