INDEX
    Explanations

    sentences that begin with the word "The."

    Negative Logits
    Token           Logit
    " Intercept"    -0.15
    "ney"           -0.14
    "sheet"         -0.13
    "roll"          -0.13
    "rief"          -0.13
    "ilia"          -0.13
    "folder"        -0.13
    "stem"          -0.13
    "arious"        -0.13
    "798"           -0.12

    Positive Logits
    Token           Logit
    "purpose"       0.17
    " purpose"      0.17
    "ater"          0.16
    " Dün"          0.16
    "ostel"         0.16
    " 본"           0.15
    "ouro"          0.14
    " Anatomy"      0.14
    "andle"         0.14
    "'gc"           0.14

    Activation Density: 0.142%

    No Known Activations