    Negative Logits
    " Bari"           0.77
    "b"               0.76
    " b"              0.76
    " bbs"            0.71
    (no token shown)  0.71
    "𝑏"               0.68
    "ભાઈ"             0.67
    "บบ"              0.65
    " छी"             0.65
    "heba"            0.65
    Positive Logits
    " ро"             0.73
    " Oxid"           0.72
    "ټ"               0.68
    " пи"             0.66
    "orty"            0.66
    "éra"             0.65
    "rots"            0.64
    "piro"            0.64
    (no token shown)  0.64
    " ауто"           0.63
    Activation Density: 0.156%

    No Known Activations