INDEX
    Explanations

    mean or rate related to outcome

    Negative Logits
    "álním"       0.46
    "('.')["      0.41
    "ichtig"      0.40
    " пропор"     0.40
    "Bracket"     0.39
    "হোক"         0.39
    "括号"        0.38
    "Between"     0.38
    " стала"      0.38
    " علاق"       0.38
    Positive Logits
    " very"       0.49
    " almost"     0.45
    " barely"     0.45
    " apenas"     0.42
    " slechts"    0.41
    "низ"         0.41
    " low"        0.40
    " amort"      0.39
    " lower"      0.39
    " much"       0.38
    Act Density 0.006%

    No Known Activations