INDEX
    Explanations
    Negative Logits
     काही      0.47
     কয়েক     0.45
     কিছু      0.43
    (blank)   0.42
     ۹        0.41
    StarGo    0.41
    🥒        0.41
    ንዳንድ      0.40
     धीरे      0.40
    (blank)   0.39
    Positive Logits
    li        0.40
     sixth    0.39
     g        0.37
    6         0.37
    context   0.36
    4         0.35
     ty       0.35
    class     0.35
     Ty       0.35
    sti       0.35
    Activation Density: 0.012%

    No Known Activations