INDEX
    Explanations

    variations of the word "have" in different contexts

    Negative Logits
    Token     Logit
    i         -0.21
    iya       -0.19
    egr       -0.19
    iffin     -0.17
    ec        -0.16
    enie      -0.16
    auf       -0.16
    alette    -0.16
    udeau     -0.15
    alink     -0.15
    Positive Logits
    Token     Logit
    vy        0.23
    oir       0.23
    olution   0.22
    irtual    0.21
    ्ह        0.20
    ersion    0.19
    engers    0.19
    ell       0.18
    est       0.18
    oice      0.18
    Act Density 0.034%

    No Known Activations