INDEX
    Explanations

    instances of dishonest behavior, specifically cheating

    terms related to dishonest behavior and cheating

    Negative Logits
    ŃĶ        -0.76
    area      -0.71
    escal     -0.70
    areth     -0.69
    oran      -0.67
    Vert      -0.67
    entin     -0.63
    rez       -0.62
     vig      -0.62
    eric      -0.61
    Positive Logits
     cheating    0.88
     cheat       0.81
     cheated     0.80
    ulence       0.78
     loophole    0.72
    raud         0.71
    ulent        0.70
     exploited   0.68
     loopholes   0.67
     wrongdoing  0.65
    Act Density 0.111%

    No Known Activations