INDEX
    Explanations

    phrases indicating affirmation or emphasis

    phrases that indicate reasoning or justification for actions and situations

    NEGATIVE LOGITS
    Token           Logit
    "robe"          -0.79
    "uttering"      -0.67
    "still"         -0.65
    "istically"     -0.62
    "lich"          -0.62
    "ature"         -0.62
    "jj"            -0.61
    "ãĥ¼ãĥ³"        -0.61
    "eer"           -0.60
    "UNE"           -0.60
    POSITIVE LOGITS
    Token             Logit
    " happens"        1.13
    "soever"          1.13
    " happened"       1.06
    " happ"           0.94
    " separates"      0.83
    " transpired"     0.77
    " distinguishes"  0.76
    "atus"            0.73
    " motiv"          0.73
    " Happ"           0.73
    Activation Density: 0.037%

    No Known Activations