    Negative Logits
    ۔  0.64
    (token missing)  0.64
     ذریع  0.62
    s  0.62
    (token missing)  0.60
    ीय  0.57
     کریں۔  0.57
    (token missing)  0.56
     in  0.55
    (token missing)  0.55
    Positive Logits
     Toilet  0.78
     toilets  0.76
    at  0.75
    م  0.74
    it  0.68
    м  0.65
    m  0.64
    štění  0.63
    ak  0.62
    N  0.61
    Act Density 0.001%

    No Known Activations