INDEX
    Explanations

    reasons or justifications for a situation or phenomenon

    Negative Logits
    "illet"       -1.22
    "jet"         -1.06
    "engeance"    -1.03
    "estial"      -1.00
    "luster"      -0.99
    "emp"         -0.93
    "ammy"        -0.93
    "nir"         -0.91
    "ategory"     -0.91
    "mire"        -0.91
    Positive Logits
    " why"             1.45
    " WHY"             1.31
    "udic"             1.09
    "why"              1.08
    "ĸļ"               1.07
    " how"             1.01
    "ance"             1.01
    "cases"            0.99
    "¿½"               0.98
    " explanations"    0.97
    Act Density 0.888%

    No Known Activations