    Negative Logits
    Token            Logit
    " Strict"        -0.10
    "jež"            -0.08
    "zeg"            -0.08
    " boys"          -0.08
    " Efficiency"    -0.07
    "acam"           -0.07
    " Ways"          -0.07
    " succès"        -0.07
    "orspr"          -0.07
    "Strict"         -0.07
    Positive Logits
    Token            Logit
    " adjustments"   0.09
    " angepasst"     0.08
    "OFFSET"         0.08
    "offset"         0.08
    "িমাণ"           0.08
    "adjust"         0.08
    " offset"        0.08
    " compensation"  0.08
    " HE"            0.08
    " adjustable"    0.08
    Activation Density: 0.023%

    No Known Activations