INDEX
    Explanations

    discussions about understanding and recognizing the distinction between reality and constructed concepts

    Negative Logits
    ixer        -0.15
    ÎŃαÏĤ        -0.15
     ÙĪØ§Ø¨      -0.14
    Ñĸдом       -0.14
    osit        -0.14
     Zahl       -0.14
    intColor    -0.13
    olin        -0.13
    ILT         -0.13
    زÙĬز        -0.13
    Positive Logits
     Curtain    0.14
     underst    0.14
    408         0.14
    ema         0.14
    rieg        0.14
    Poss        0.14
    ansson      0.13
    ØŃ          0.13
    ä¸ĭ         0.13
    çĩ          0.13
    Act Density 1.931%

    No Known Activations