    Explanations

    phrases indicating comparisons and contrasts

    Negative Logits
    Token              Logit
    "ÙĪØ±Ø´"           -0.14
    " Jackie"          -0.14
    " pard"            -0.14
    "lish"             -0.14
    "å¤"               -0.14
    "231"              -0.13
    "oit"              -0.13
    "иÑĪ"              -0.13
    "ë¹Ľ"              -0.13
    " tro"             -0.13

    Positive Logits
    Token              Logit
    " gesture"         0.21
    " attempt"         0.19
    " way"             0.19
    " measure"         0.18
    " contribution"    0.18
    " part"            0.17
    " gift"            0.17
    " Gesture"         0.16
    " response"        0.16
    " Contribution"    0.15
    Activation Density: 0.144%

    No Known Activations