INDEX
    Explanations

    now, too, though, here, First

    Negative Logits
    " "  0.38
    "/"  0.38
    " twenty"  0.33
    "_"  0.33
    " five"  0.29
    "াইব"  0.29
    " dozens"  0.29
    " twentieth"  0.27
    " ove"  0.27
    " thirty"  0.27
    Positive Logits
    "،"  0.33
    " kalau"  0.32
    " ,"  0.32
    "מ"  0.32
    "厳しい"  0.31
    "са"  0.31
    " ដែល"  0.30
    (token not captured)  0.30
    "त्न"  0.29
    "туры"  0.29
    Activation Density: 0.247%

    No Known Activations