    Explanations

    comparisons using the word "like"

    phrases that indicate similarity or comparison

    Negative Logits
    Token           Logit
    "eele"          -0.62
    " allocated"    -0.58
    " Urban"        -0.57
    " Limited"      -0.56
    " util"         -0.56
    " invited"      -0.55
    " UK"           -0.55
    " Haw"          -0.55
    " available"    -0.54
    " scheduled"    -0.53
    Positive Logits
    Token           Logit
    "like"          3.93
    "esque"         1.95
    "Like"          1.77
    " LIKE"         1.63
    "shaped"        1.39
    " Like"         1.39
    " like"         1.38
    "style"         1.31
    "fortunately"   1.30
    "similar"       1.24
    Activation Density: 0.008%

    No Known Activations