INDEX
    Explanations

    imperative verbs and function definitions in code

    Negative Logits
    "amon"      -0.17
    "tember"    -0.16
    "ancies"    -0.16
    " jich"     -0.16
    "à¹Ģส"     -0.15
    "Ú©ÛĮ"      -0.14
    "ÑıÑĩ"      -0.14
    "é¡į"       -0.14
    "گرÛĮ"     -0.14
    "ÏĩÏİ"      -0.14
    Positive Logits
                0.20
    ":↵"        0.16
    "ince"      0.16
    "zens"      0.16
    " not"      0.16
    "19"        0.16
    "oley"      0.15
    "   "       0.15
    "oulos"     0.15
    " Cong"     0.15
    Act Density 0.005%

    No Known Activations