INDEX
    Explanations

    phrases emphasizing the importance of details and subjective experiences

    Negative Logits
    Token           Logit
    " anywhere"     -0.18
    " nowhere"      -0.17
    " neither"      -0.17
    " not"          -0.16
    "aktu"          -0.15
    " overall"      -0.15
    " alone"        -0.15
    " both"         -0.15
    " something"    -0.15
    " gener"        -0.15
    Positive Logits
    Token               Logit
    " happening"        0.20
    "_except"           0.20
    "Except"            0.20
    "iem"               0.18
    "except"            0.18
    " Except"           0.18
    " گفته"             0.17
    "azen"              0.17
    " happens"          0.16
    " interconnected"   0.16
    Act Density 0.141%

    No Known Activations