    Explanations

    references to people beyond oneself or one's immediate circle

    Negative Logits
     otherwise
    -0.18
    edly
    -0.17
     itself
    -0.17
    rail
    -0.15
    另一
    -0.15
    swers
    -0.14
    ibur
    -0.14
     Other
    -0.14
    entai
    -0.14
     autre
    -0.14
    Positive Logits
    -than
    0.22
    most
    0.20
    world
    0.19
    wis
    0.19
    /new
    0.19
    /all
    0.18
    ness
    0.18
    bes
    0.18
     besides
    0.18
    elves
    0.17
    Act Density 0.042%

    No Known Activations