INDEX
    Explanations

    concepts related to morality and consciousness

    Negative Logits
     partly
    -0.18
     lots
    -0.16
     people
    -0.15
     started
    -0.14
     everybody
    -0.14
    æĬĬ
    -0.14
     different
    -0.14
     à¹Ĩ
    -0.13
     clos
    -0.13
     using
    -0.13
    Positive Logits
    ãģ«ãģ¦
    0.19
    upon
    0.15
     aforementioned
    0.14
    PostalCodes
    0.14
    Upon
    0.14
    ~-~-~-~-
    0.14
    ARGIN
    0.13
    sans
    0.13
    arth
    0.13
     Upon
    0.13
    Activation Density: 3.983%

    No Known Activations
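
    Several tokens in the logit lists above (for example `æĬĬ`, ` à¹Ĩ`, and `ãģ«ãģ¦`) look like raw byte-level BPE vocabulary strings rather than readable text. The sketch below shows one way to turn them back into text. It assumes the underlying tokenizer uses the GPT-2-style byte-to-character table, which this page does not state. The helper name `decode_token` is hypothetical, and treating a leading plain space as byte 0x20 is also an assumption about how this page renders the space marker.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2-style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Invert the table: vocabulary character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}
# Assumption: the page shows the space marker as a plain space, so map it to byte 0x20.
CHAR_TO_BYTE[" "] = 0x20


def decode_token(token: str) -> str:
    """Map a byte-level vocabulary string back to UTF-8 text (hypothetical helper)."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    # Tokens can end mid-character, so replace invalid sequences instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãģ«ãģ¦", "æĬĬ", " à¹Ĩ", "upon"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

    Under that assumption, `ãģ«ãģ¦` decodes to the Japanese `にて`, `æĬĬ` to the Chinese `把`, and ` à¹Ĩ` to a space followed by the Thai `ๆ`. Plain ASCII tokens such as `upon` pass through unchanged.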