INDEX
    Explanations

    expressions prompting personal reflection or self-assessment

    Negative Logits
    Token       Logit
    "alth"      -0.17
    " Sez"      -0.16
    "ismet"     -0.15
    "ledon"     -0.15
    "hare"      -0.15
    " konk"     -0.14
    "-meta"     -0.14
    "ropolis"   -0.14
    "edback"    -0.14
    "bles"      -0.14
    Positive Logits
    Token       Logit
    " whether"  0.17
    "agate"     0.17
    "zew"       0.14
    "482"       0.14
    " Whether"  0.14
    "gra"       0.14
    "leur"      0.14
    "gis"       0.14
    "chor"      0.14
    "orer"      0.13
    Act Density: 0.254%

    No Known Activations