INDEX
    Explanations

    comparisons or quantities for categories

    Negative Logits
    Token           Value
    "estine"        0.69
    " was"          0.69
    " told"         0.68
    " đị"           0.67
    " Entries"      0.67
    "posterous"     0.66
    " Tudo"         0.66
    "্যাপ"           0.64
    " According"    0.64
    "astia"         0.64
    Positive Logits
    Token             Value
    " lebih"          3.48
    " greater"        3.47
    "更高的"           3.44
    " better"         3.42
    " more"           3.31
    " hơn"            3.29
    [token missing]   3.28
    " clearer"        3.24
    "更好的"           3.23
    " fewer"          3.17
    Activation Density: 2.591%
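    The density figure above is most plausibly the share of sampled token positions on which the feature has a nonzero activation. The page does not state the definition, so the sketch below is an assumption. The function name `activation_density`, the `threshold` parameter, and the toy activations are all hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active. The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data, not from the source: 3 of 8 positions are active.
toy = [0.0, 0.0, 1.2, 0.0, 0.4, 0.0, 0.0, 2.1]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 37.500%
```

    Under this reading, a density of 2.591% would mean the feature fires on roughly 26 of every 1,000 sampled tokens.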

    No Known Activations