    Explanations

    verbs related to communication or interaction

    Negative Logits
    Token           Logit
    " suppose"      -0.62
    "terday"        -0.56
    " namely"       -0.54
    " anchored"     -0.54
    " hosted"       -0.52
    " applied"      -0.52
    " viz"          -0.52
    " awaited"      -0.51
    "she"           -0.51
    " requiring"    -0.50
    Positive Logits
    Token           Logit
    " oneself"      1.05
    " ourselves"    1.03
    " yourselves"   1.03
    " yourself"     1.00
    " them"         0.93
    "ulate"         0.89
    " themselves"   0.88
    "ãĥĥãĥī"        0.82
    "igate"         0.81
    "iate"          0.80
    Activation Density: 2.402%

    No Known Activations