INDEX
    Explanations

    modal verbs and auxiliaries

    Negative Logits
    " updating"  0.38
    "ஒரு"  0.36
    "update"  0.36
    "include"  0.35
    "ya"  0.35
    " itself"  0.34
    " underside"  0.34
    "params"  0.34
    "neath"  0.34
    " an"  0.34

    Positive Logits
    " themselves"  0.47
    " flock"  0.44
    " وطالبات"  0.43
    " ktorí"  0.43
    " получают"  0.42
    "ಿದ್ದಾರೆ"  0.41
    " této"  0.40
    " kteří"  0.40
    " často"  0.40
    "纷纷"  0.38
    Act Density 0.165%

    No Known Activations