INDEX
    Explanations

    proper names and companies

    NEGATIVE LOGITS
     as
    0.75
     of
    0.67
    ้น
    0.59
    না
    0.59
    یت
    0.58
     มัน
    0.58
     on
    0.57
     at
    0.56
    0.56
    بد
    0.56
    POSITIVE LOGITS
    x
    1.08
    n
    0.98
    ir
    0.96
    w
    0.94
    is
    0.92
    d
    0.90
    il
    0.89
    á
    0.88
    ij
    0.81
    v
    0.78
    Act Density 0.026%

    No Known Activations