INDEX
    Explanations
    Negative Logits
    "тери"      -0.06
    "имо"       -0.06
    "ure"       -0.06
    "de"        -0.06
    "arms"      -0.06
    " cane"     -0.06
    "ули"       -0.06
    "des"       -0.06
    " busca"    -0.06
    "apan"      -0.06
    Positive Logits
    " took"     0.12
    " takes"    0.10
    " نیست"     0.07
                0.07
    " پژوه"     0.07
    " taken"    0.07
    " taking"   0.07
    " take"     0.07
    "Touches"   0.07
    "098"       0.07
    Act Density 0.035%

    No Known Activations