    Explanations

    references to coefficients in mathematical expressions or equations

    Negative Logits
    Token       Logit
    'cristo'    -0.65
    'باه'       -0.64
    '"</'       -0.63
    'away'      -0.62
    'imas'      -0.61
    'Caj'       -0.59
    'around'    -0.58
    'openqa'    -0.57
    'σκ'        -0.57
    '𝘪'         -0.55
    Positive Logits
    Token              Logit
    ' coefficients'    1.59
    ' Coefficients'    1.48
    ' Coefficient'     1.46
    'coefficients'     1.42
    'fficients'        1.41
    ' coefficient'     1.40
    'Coefficient'      1.37
    'coefficient'      1.37
    ' coeff'           1.22
    'coeff'            1.13
    Activation Density: 0.020%

    No Known Activations