    Explanations

    references to numerical data or identifiers

    Negative Logits
    Token        Logit
    " nghĩa"     -0.16
    "sah"        -0.15
    "RI"         -0.15
    "ever"       -0.15
    "issant"     -0.15
    "ODE"        -0.14
    "sed"        -0.14
    "UCK"        -0.14
    "acht"       -0.14
    "aktu"       -0.14
    Positive Logits
    Token        Logit
    " Dare"      0.16
    "ouri"       0.15
    "ól"         0.14
    "eneg"       0.14
    " Mant"      0.14
    "afort"      0.14
    "orda"       0.14
    "zk"         0.13
    "pcodes"     0.13
    "rescia"     0.13
    Activation Density: 0.035%

    No Known Activations