INDEX
    Explanations

    complex and abstract concepts, particularly those related to existential themes

    Negative Logits
    ily
    -0.17
    odcast
    -0.16
    arkan
    -0.15
     بس
    -0.15
    Ai
    -0.15
    NECT
    -0.15
    aring
    -0.14
     Bulk
    -0.14
    λέ
    -0.14
    ιλ
    -0.14
    Positive Logits
     mid
    0.21
    mid
    0.19
    Mid
    0.17
     Mid
    0.17
    issy
    0.16
     Lair
    0.16
    allet
    0.15
    _mid
    0.15
    achers
    0.15
    โพ
    0.15
    Act Density 0.019%

    No Known Activations