INDEX
    Explanations

    terms related to expectations, conditions, and potential actions in discourse

    Negative Logits
    Token            Logit
    ".fd"            -0.16
    " Bou"           -0.16
    "imar"           -0.16
    "ested"          -0.15
    "ieres"          -0.15
    "ulet"           -0.15
    "umbles"         -0.15
    "isher"          -0.15
    " Svens"         -0.14
    "Ñīик"           -0.14

    Positive Logits
    Token            Logit
    "ζα"             0.15
    "aload"          0.15
    "lander"         0.15
    "asil"           0.14
    "echa"           0.14
    " Torrent"       0.14
    " anything"      0.14
    "ÙĦÙĪØ¯"         0.14
    " kostenlose"    0.14
    "reinterpret"    0.13
    Activation Density: 0.004%

    No Known Activations