INDEX
    Explanations

    instances of strong emotional reactions or experiences

    Negative Logits
    "Neighbors"    -0.19
    " Favorite"    -0.19
    "Favorite"     -0.17
    "favorite"     -0.17
    " Flavor"      -0.17
    " coloring"    -0.17
    " à¹Ĩ"         -0.17
    " colorful"    -0.16
    " favorite"    -0.16
    "neighbor"     -0.16
    Positive Logits
    " uk"          0.19
    " yesterday"   0.17
    ".uk"          0.15
    " UK"          0.15
    "uk"           0.15
    "READ"         0.15
    "abay"         0.15
    "(Image"       0.14
    "erator"       0.14
    "erli"         0.14
    Activation Density: 0.058%

    No Known Activations