Explanations
comparisons or quantities for categories
Negative Logits
estine      0.69
was         0.69
told        0.68
đị          0.67
Entries     0.67
posterous   0.66
Tudo        0.66
্যাপ         0.64
According   0.64
astia       0.64
Positive Logits
lebih       3.48
greater     3.47
更高的      3.44
better      3.42
more        3.31
hơn         3.29
더          3.28
clearer     3.24
更好的      3.23
fewer       3.17
Activations Density 2.591%