INDEX
    Explanations

    instances of self-reference and personal commentary

    Negative Logits
    沢           -0.15
    lex         -0.15
     Erd        -0.15
    eree        -0.15
     Lar        -0.14
     Shak       -0.14
     Steele     -0.14
     Sil        -0.14
    subtype     -0.14
    ilar        -0.14
    Positive Logits
     above      0.48
    above       0.42
     ABOVE      0.39
     Above      0.39
    Above       0.39
    以上          0.34
    bove        0.32
    _above      0.32
     выше       0.32
     foregoing  0.31
    Act Density 0.156%

    No Known Activations