    Explanations

    the word "reason" in various forms, indicating explanations or justifications

    Negative Logits
    "haustible"        -0.93
    "orgeous"          -0.92
    "}$​"               -0.90
    " الحره"           -0.88
    "zsef"             -0.86
    "extAlignment"     -0.86
    " كومونز"          -0.86
    "$​"                -0.85
    " समीक्षक"          -0.85
    "edipus"           -0.84
    Positive Logits
    " reasons"         1.94
    " reason"          1.80
    " Reasons"         1.72
    " REASON"          1.68
    "reasons"          1.66
    " Reason"          1.66
    "reason"           1.56
    "Reason"           1.54
    "Reasons"          1.52
    " REASONS"         1.49
    Activation Density: 0.092%

    No Known Activations