    NEGATIVE LOGITS
    _critical
    -0.08
     ounce
    -0.07
     здат
    -0.07
    feature
    -0.07
    -0.07
    aneous
    -0.07
     estimator
    -0.06
    غيرة
    -0.06
    gens
    -0.06
    اره
    -0.06
    POSITIVE LOGITS
     Chúa
    0.07
     Woo
    0.07
    <Button
    0.07
     возможности
    0.06
    BACKGROUND
    0.06
    ---@
    0.06
     være
    0.06
     việc
    0.06
    ----------</
    0.06
     italiano
    0.06
    Activation Density 0.037%

    No Known Activations