INDEX
    Explanations

    various forms of the verb "be" in different contexts

    Negative Logits
    Token       Logit
    "ÑĭÑģ"      -0.17
    "mia"       -0.16
    "urm"       -0.15
    "vided"     -0.14
    "beck"      -0.14
    "æ¡£"       -0.14
    "erti"      -0.14
    "undi"      -0.13
    "kowski"    -0.13
    "rine"      -0.13
    Positive Logits
    Token       Logit
    "ardless"   0.18
    "ying"      0.17
    " sure"     0.16
    "oga"       0.15
    "eh"        0.15
    "asts"      0.15
    "Ù쨳"       0.14
    "friend"    0.14
    "ething"    0.14
    "oted"      0.14
    Activation Density: 0.061%

    No Known Activations