INDEX
    Explanations

    phrases related to emphasizing a specific point or idea

    NEGATIVE LOGITS
    hens        -0.77
    tails       -0.76
    ーク        -0.72
    oran        -0.71
    obb         -0.70
    orian       -0.69
    orah        -0.68
    uty         -0.67
    istance     -0.67
    arest       -0.66
    POSITIVE LOGITS
     they        0.88
     pesky       0.80
     there       0.79
     THEY        0.79
     fateful     0.75
     someday     0.73
    soever       0.73
     kind        0.72
     we          0.71
     although    0.70
    Activation Density: 0.274%

    No Known Activations