INDEX
    Explanations

    personal pronouns followed by words indicating actions or situations

    repeated references to the word "we."

    Negative Logits
    " Pwr"            -0.76
    "trak"            -0.69
    " Publication"    -0.67
    "oute"            -0.66
    " Mehran"         -0.65
    "fleet"           -0.60
    "fect"            -0.59
    "bay"             -0.59
    "cart"            -0.58
    " Watt"           -0.57
    Positive Logits
    "'re"             1.12
    "IRD"             0.94
    "asel"            0.92
    "aning"           0.92
    "athered"         0.89
    "selves"          0.85
    "asley"           0.84
    "eping"           0.84
    "bsite"           0.83
    "'ve"             0.82
    Activation Density: 0.225%

    No Known Activations