INDEX
    Explanations

    boolean values indicating true or false conditions

    NEGATIVE LOGITS
    emer      -0.19
    mand      -0.16
    rary      -0.15
    Clo       -0.15
    avery     -0.14
    esc       -0.14
    ä¹        -0.14
    ret       -0.13
     Sund     -0.13
    rels      -0.13
    POSITIVE LOGITS
    izoph     0.17
    /false    0.17
    ushima    0.16
    odoxy     0.16
    STALL     0.16
    setattr   0.15
    ongs      0.15
    reesome   0.15
    entiful   0.14
    ToMany    0.14
    Activation Density: 0.030%

    No Known Activations