    Explanations

    phrases that include the word "Good"

    Negative Logits
    "laz"           -0.18
    "adora"         -0.15
    "ufen"          -0.15
    " Fur"          -0.14
    " Bash"         -0.14
    "pic"           -0.14
    "uhl"           -0.14
    " Robertson"    -0.14
    "agic"          -0.13
    "adu"           -0.13
    Positive Logits
    "reads"          0.29
    "bye"            0.27
    "onya"           0.22
    "win"            0.21
    " Samar"         0.20
    "ness"           0.19
    "acre"           0.19
    "night"          0.18
    "ie"             0.17
    " intentions"    0.17
    Activation Density: 0.044%

    No Known Activations