INDEX
    Explanations

    references to page numbers or citations in texts

    Negative Logits
    народ
    -0.15
    -Token
    -0.14
    altet
    -0.14
    VD
    -0.14
    @nate
    -0.14
    XA
    -0.14
    นก
    -0.13
     Pey
    -0.13
    ็ด
    -0.13
    _UNUSED
    -0.13
    Positive Logits
     vi
    0.17
    3
    0.16
     Kindle
    0.16
     unp
    0.15
     ii
    0.15
    hana
    0.14
    34
    0.14
    ainer
    0.14
     facing
    0.14
    76
    0.14
    Act Density 0.055%

    No Known Activations