    Explanations

    phrases indicating inquiry or seeking information

    Negative Logits
    Token        Logit
    "achi"       -0.17
    ".Proxy"     -0.17
    "jÃł"        -0.15
    "enu"        -0.14
    "NET"        -0.14
    "monds"      -0.13
    "iei"        -0.13
    " voks"      -0.13
    " '\''"      -0.13
    "inki"       -0.13
    Positive Logits
    Token        Logit
    " whether"   0.20
    "uil"        0.17
    " why"       0.16
    " about"     0.16
    "whether"    0.16
    " how"       0.15
    "ENER"       0.15
    "θι"         0.14
    ".jdesktop"  0.14
    "_about"     0.14
    Act Density 0.048%

    No Known Activations