INDEX
    Explanations

    phrases that describe actions or states occurring simultaneously

    Negative Logits
    "bourg"     -0.19
    "oggles"    -0.15
    "ysi"       -0.15
    "ollower"   -0.15
    "ulis"      -0.14
    "heimer"    -0.14
    "alette"    -0.14
    "ınızda"    -0.14
    "cec"       -0.14
    "evice"     -0.14
    Positive Logits
    "eder"      0.15
    "aph"       0.15
    " they"     0.15
    "bows"      0.15
    " doors"    0.14
    " trains"   0.14
    " cupid"    0.14
    "pret"      0.14
    "ews"       0.14
    "ph"        0.14
    Activation Density: 0.081%

    No Known Activations