INDEX
    Explanations

    phrases questioning the reasoning behind certain actions or statements

    Negative Logits
    Token         Logit
    "CRET"        -0.17
    "éĩı"         -0.17
    "aigned"      -0.15
    "modal"       -0.14
    "leo"         -0.13
    "éģķ"         -0.13
    " pilot"      -0.13
    " çľģ"        -0.13
    " opendir"    -0.13
    "weeted"      -0.12
    Positive Logits
    Token         Logit
    " why"        0.16
    "antom"       0.15
    "еÑĢб"        0.15
    "ownik"       0.15
    "why"         0.14
    "odox"        0.14
    "/how"        0.14
    "ưỡng"        0.14
    "zeÅĦ"        0.13
    "kinci"       0.13
    Activation Density: 0.036%

    No Known Activations