INDEX
    Explanations

    statements indicating reasons or justifications for a particular situation or action

    instances of causal explanations or reasons

    Negative Logits
    Eye     -0.69
    mint    -0.68
    shaw    -0.67
    yan     -0.65
    thal    -0.62
    Luc     -0.62
    se      -0.61
    nin     -0.60
    vous    -0.59
    lem     -0.58
    Positive Logits
    */(            0.96
    endment        0.88
    assetsadobe    0.84
    akening        0.75
    ifference      0.73
    xual           0.73
    uristic        0.72
    ecause         0.72
    arcity         0.70
     proxies       0.70
    Act Density 0.073%

    No Known Activations