    Explanations

    references to betrayal and moral conflict

    Negative Logits
    Token            Logit
    "ksen"           -0.16
    " harass"        -0.16
    "omedical"       -0.15
    "ictim"          -0.15
    "amework"        -0.14
    " bumper"        -0.14
    "arge"           -0.14
    "رÙĤ"            -0.14
    " harassment"    -0.14
    "agle"           -0.13
    Positive Logits
    Token            Logit
    " trait"         0.67
    " betray"        0.54
    "trait"          0.54
    " betrayal"      0.53
    " Bet"           0.47
    " betr"          0.47
    " betrayed"      0.46
    "bet"            0.44
    " tre"           0.43
    "_trait"         0.43
    Activation Density: 0.339%

    No Known Activations