    Explanations

    patterns related to reasoning and justification

    "[reason]" or "[reasons]" after certain words

    Negative Logits
    Token               Logit
    "SourceChecksum"    -0.83
    " estekak"          -0.82
    " Roskov"           -0.73
    " referenties"      -0.72
    " esternos"         -0.65
    " nahilalakip"      -0.65
    "CppMethod"         -0.63
    "Diwedd"            -0.62
    "sätzlich"          -0.61
    "########."         -0.59

    Positive Logits
    Token               Logit
    " reason"           3.02
    " reasons"          2.73
    "reason"            2.41
    "Reason"            2.17
    " Reason"           2.16
    "reasons"           2.12
    " Reasons"          2.09
    " REASON"           2.02
    "Reasons"           1.95
    " REASONS"          1.86
    Activation Density: 0.411%

    No Known Activations