INDEX
    Explanations

    references to scientific studies and research findings

    Negative Logits
    <^             -0.58
    uestions       -0.55
    oO             -0.53
     |]            -0.53
    Corresponding  -0.52
    Faktor         -0.52
     Sot           -0.51
    :,,            -0.50
     «<            -0.48
     sii           -0.47
    Positive Logits
    relenting      1.00
    sightly        0.94
    mistak         0.83
    wavering       0.80
    thinkable      0.79
     vestiti       0.73
     affez         0.73
    <bos>          0.72
     fidanz        0.70
     sentito       0.69
    Act Density 0.327%

    No Known Activations