    NEGATIVE LOGITS
        Token        Logit
        "ቱም"         0.44
        " السبب"     0.41
        "інки"       0.40
        "之所以"     0.38
        "ાસ"         0.37
        " dichos"    0.36
        " needn"     0.36
        "maked"      0.34
        "ետ"         0.34
        "}.$"        0.34
    POSITIVE LOGITS
        Token          Logit
        " What"        2.23
        "What"         2.17
        " what"        2.09
        "what"         1.92
        " How"         1.91
        " how"         1.79
        "How"          1.79
        " क्या"         1.73
        "どのような"   1.64
        " ماذا"        1.59
    Activation Density: 0.103%

    No Known Activations