INDEX
    Explanations

    instances of the word "love" and its variations

    Negative Logits
    urch         -0.18
     ëı          -0.17
    iff          -0.16
    æijĩ         -0.15
    erer         -0.15
    Ùħا          -0.15
    iffs         -0.14
     kontakte    -0.14
     prés        -0.14
    geber        -0.14
    Positive Logits
    eliness       0.23
    ely           0.19
    vv            0.18
    renc          0.18
    ullo          0.18
    alker         0.16
    ett           0.16
    ell           0.16
    ohl           0.16
    ania          0.15
    Act Density 0.005%

    No Known Activations