INDEX
    Explanations

    directives like `@name`, `@inputs`

    Negative Logits
    `():`    0.91
    `):`     0.84
    `.):`    0.80
    `**:`    0.77
    `!:`     0.75
    `*:`     0.74
    `.:`     0.73
    `}:`     0.71
    `:`      0.68
    `:</`    0.67
    Positive Logits
    ` освіти`     0.39
    ` נ`          0.38
    ` 쓰고`       0.38
    `込み`        0.38
    (blank)       0.38
    ` ק`          0.37
    ` Gründen`    0.37
    ` 어떻게`     0.36
    ` ע`          0.36
    ` تحديد`      0.36
    Act Density 0.023%

    No Known Activations