INDEX
    Explanations

    phrases indicating comparisons and expressions of identity

    Negative Logits
    Token         Logit
    "obe"         -0.18
    "ahren"       -0.17
    "itos"        -0.15
    "bian"        -0.15
    " EOF"        -0.15
    "icit"        -0.14
    "ä»ķ"         -0.14
    "éİ®"         -0.14
    ".construct"  -0.14
    "Parts"       -0.14
    Positive Logits
    Token         Logit
    " bergen"     0.16
    "alous"       0.15
    " Mot"        0.14
    "SZ"          0.14
    "gf"          0.14
    "ż"           0.14
    " Ap"         0.13
    " breathed"   0.13
    "vore"        0.13
    "Ap"          0.13
    Act Density 0.436%

    No Known Activations