INDEX
    Explanations

    phrases indicating knowledge or understanding

    instances of the word "knows."

    Negative Logits
    Token             Logit
    "phrine"          -0.94
    "oples"           -0.77
    "isco"            -0.69
    "unal"            -0.68
    "OPLE"            -0.68
    "nesota"          -0.68
    "anmar"           -0.67
    "uld"             -0.67
    "interstitial"    -0.67
    "adies"           -0.66
    Positive Logits
    Token             Logit
    "ledged"          1.04
    "ledge"           0.84
    "terday"          0.78
    " whats"          0.77
    " how"            0.72
    "lege"            0.71
    " exactly"        0.71
    "afer"            0.71
    "how"             0.70
    "LED"             0.67
    Act Density 0.043%
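
To give a sense of scale for the density figure, the sketch below converts it into an expected token count. It assumes "Act Density" means the percentage of tokens in sampled text on which the feature has a nonzero activation; the page itself does not define the term, and the function name `expected_active_tokens` is hypothetical.

```python
def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires.

    Assumes density_percent is the percentage of tokens with nonzero
    activation (an assumption; the page does not define it).
    """
    return density_percent / 100.0 * n_tokens


ACT_DENSITY_PERCENT = 0.043  # value reported on this page

# Roughly 43 active tokens per 100,000 tokens under the assumption above.
print(expected_active_tokens(ACT_DENSITY_PERCENT, 100_000))
```

Under that reading, the feature is sparse: it would be expected to fire on fewer than one token in two thousand.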

    No Known Activations