INDEX
    Explanations

    phrases related to first impressions

    references to initial impressions or superficial observations

    Negative Logits
    "icer"       -0.76
    "tailed"     -0.73
    "ammy"       -0.73
    "rus"        -0.70
    "onement"    -0.69
    "winter"     -0.69
    "rop"        -0.66
    "lez"        -0.66
    "orders"     -0.65
    "Joined"     -0.65
    Positive Logits
    " blush"             0.89
    " glance"            0.89
    " superf"            0.84
    " IMAGES"            0.78
    " premise"           0.71
    "intuitive"          0.67
    " intuitive"         0.66
    "ILD"                0.65
    "ANE"                0.64
    " understandable"    0.64
    Activation Density: 0.138%

    No Known Activations