INDEX
    Explanations

    references to secrets and hidden information

    NEGATIVE LOGITS
    Token        Logit
    "ünkü"       -0.15
    "eo"         -0.15
    "OPTIONS"    -0.13
    "وند"        -0.13
    "562"        -0.13
    "ayo"        -0.13
    "Pragma"     -0.13
    "acia"       -0.13
    "же"         -0.13
    "359"        -0.12
    POSITIVE LOGITS
    Token        Logit
    " secret"    0.77
    " secrets"   0.73
    " Secret"    0.60
    "secret"     0.60
    " Secrets"   0.58
    "秘"         0.57
    "-secret"    0.56
    "Secret"     0.56
    " SECRET"    0.55
    " secre"     0.53
    Activation density: 0.196%

    No Known Activations