INDEX
    Explanations

    sentences starting with the word "So"

    Negative Logits
    Token         Logit
    "saf"         -0.73
    "thro"        -0.64
    " nic"        -0.60
    "sk"          -0.58
    "exclusive"   -0.57
    " Purg"       -0.57
    " Halls"      -0.56
    "degree"      -0.56
    "ski"         -0.56
    "âĹ¼"         -0.55
    Positive Logits
    Token         Logit
    "oner"        1.31
    "bered"       1.09
    "fter"        1.07
    "FTWARE"      1.04
    "apy"         1.02
    "ooo"         0.96
    "oths"        0.96
    "othes"       0.95
    " far"        0.92
    "aring"       0.90
    Activation Density 0.228%

    No Known Activations