INDEX
    Explanations

    representations of colors in the text

    Negative Logits
    chan        -0.20
     basket     -0.18
     Basket     -0.16
    æĪ¸         -0.16
    engo        -0.16
    ongo        -0.15
    .Areas      -0.14
    ula         -0.14
    oot         -0.14
     Äįin       -0.14

    Positive Logits
    mia         0.17
    istical     0.16
    avirus      0.16
    isphere     0.15
    ucs         0.14
    voke        0.14
    ataire      0.14
    ecycle      0.14
    istique     0.14
    اعد         0.14
    Activation Density: 0.013%

    No Known Activations