    Explanations

    phrases indicating significant revelations or identity changes in a narrative context

    Negative Logits
    Token             Logit
    "lect"            -0.15
    "Trigger"         -0.15
    "£i"              -0.15
    "ãģ§ãģĹãĤĩãģĨ"    -0.14
    "regs"            -0.14
    "ect"             -0.13
    "#"               -0.13
    "etooth"          -0.13
    " Alone"          -0.13
    " alone"          -0.13
    Positive Logits
    Token             Logit
    " unb"            0.19
    " secret"         0.18
    "quires"          0.17
    " secretly"       0.17
    " Secret"         0.16
    "secret"          0.15
    "uther"           0.15
    "ecret"           0.15
    " footer"         0.15
    "-secret"         0.14
    Activation Density: 0.181%

    No Known Activations