INDEX
    Explanations

    preferences and choices in contexts of liking or favoring something

    Negative Logits
    ñ         -0.15
    about     -0.14
     pari     -0.14
    nad       -0.14
    mand      -0.14
    fig       -0.14
    kowski    -0.14
    ekyll     -0.14
    ader      -0.14
    ốt        -0.14
    Positive Logits
    entially   0.29
    ential     0.20
    ably       0.19
    ชม         0.17
    /pre       0.17
    erguson    0.16
    iable      0.15
    Cog        0.15
    -than      0.14
    peare      0.14
    Act Density 0.033%

    No Known Activations