    Explanations

    phrases that begin with the word "So."

    Negative Logits
    Token         Logit
    "duk"         -0.15
    " dabei"      -0.15
    "icaret"      -0.15
    "ponse"       -0.15
    "idis"        -0.15
    "IONS"        -0.14
    " so"         -0.14
    "lon"         -0.14
    "erse"        -0.14
    "nty"         -0.14
    Positive Logits
    Token         Logit
    "oner"        0.26
    "-called"     0.24
    "ftware"      0.20
    "aked"        0.20
    "aring"       0.19
    " although"   0.19
    " instead"    0.19
    " far"        0.19
    "apy"         0.18
    "fter"        0.18
    Activation Density: 0.045%

    No Known Activations