INDEX
    Explanations

    reasons or justifications

    phrases that specify reasons or justifications

    Negative Logits
    yss                 -0.83
    ipher               -0.81
    mint                -0.80
    bats                -0.80
    thumbnails          -0.77
     ILCS               -0.74
    owship              -0.74
    OLOGY               -0.74
    achus               -0.70
    hem                 -0.70
    Positive Logits
     variance           0.81
     causation          0.72
     reasoning          0.71
     why                0.71
     justify            0.67
     discrimination     0.67
     inaction           0.66
     cite               0.66
     explan             0.65
     preferring         0.65
    Activation Density 0.137%

    No Known Activations