    Explanations

    words related to mental states or activities related to awareness and perception

    references to consciousness

    Negative Logits
    Token           Logit
    " PM"           -0.70
    " Schne"        -0.68
    "ctic"          -0.67
    " rough"        -0.66
    " Naz"          -0.65
    "GER"           -0.63
    "rug"           -0.62
    " Rough"        -0.61
    " unpublished"  -0.61
    "bor"           -0.59
    Positive Logits
    Token             Logit
    " consciousness"  1.17
    " Conscious"      1.04
    " conscious"      0.98
    "jriwal"          0.98
    " awareness"      0.94
    "ynes"            0.91
    "edly"            0.91
    "oenix"           0.86
    "ibility"         0.86
    "ysis"            0.85
    Activation Density: 0.009%

    No Known Activations