    Explanations

    the presence of significant keywords or phrases, particularly at the start of sentences or sections

    Negative Logits
    " itſelf"     -0.86
    " raiſ"       -0.86
    " Anſ"        -0.84
    " ―――――"      -0.80
    " uſed"       -0.74
    " purpoſe"    -0.74
    " poffible"   -0.72
    " himſelf"    -0.72
    " Kanna"      -0.71
    " ་་"         -0.71
    Positive Logits
    " de"         1.12
    "بوابة"       0.97
    "indd"        0.92
    " the"        0.91
    "Σε"          0.88
    " di"         0.87
    " a"          0.82
    " OF"         0.79
    "'>\n"        0.76
    "Hentet"      0.75
    Activation Density: 0.021%

    No Known Activations