    Explanations

    categories followed by prepositions

    Negative Logits
    " this"  0.48
    " هذا"  0.46
    " этом"  0.44
    "这也是"  0.44
    " этими"  0.42
    " questa"  0.41
    " đây"  0.40
    " này"  0.40
    " hepin"  0.39
    "{-#"  0.38
    Positive Logits
    " of"  1.01
    "នៃ"  0.84
    "ของการ"  0.82
    " של"  0.79
    "នៃការ"  0.72
    "of"  0.71
    " của"  0.70
    " ഓഫ്"  0.67
    " của"  0.63
    " về"  0.61
    Activation Density: 0.126%

    No Known Activations