    Explanations

    attention after 'of' or 's'

    Negative Logits
    "ৈত"  0.63
    " हड्ड"  0.62
    "խ"  0.62
    " छु"  0.61
    0.59
    " ক্যান্ট"  0.58
    "]%"  0.58
    "zsche"  0.58
    " malle"  0.58
    " शर्त"  0.58
    Positive Logits
    " attention"  3.56
    " Attention"  3.19
    "attention"  3.15
    "Attention"  3.12
    " внимание"  2.94
    " внимания"  2.80
    " Aufmerksamkeit"  2.79
    " attentions"  2.78
    "attenzione"  2.76
    " atenção"  2.68
    Activation Density: 0.474%

    No Known Activations