INDEX
    Explanations

    the word "I" as a way of identifying personal statements or subjective experiences

    Negative Logits
    O           -0.15
    avenport    -0.14
    pher        -0.14
    borough     -0.14
    ses         -0.14
    mont        -0.13
    .Lib        -0.13
    cca         -0.13
    Du          -0.13
    cco         -0.13
    Positive Logits
    weg         0.17
    šker        0.15
    968         0.15
    368         0.15
    ubo         0.14
    olina       0.14
    adays       0.14
    aklı        0.14
    sled        0.14
    iag         0.14
    Activation Density 0.154%

    No Known Activations