    Explanations

    instances of the word "conscious" and its variations

    Negative Logits
    Token             Logit
    ".assets"         -0.17
    " Hamm"           -0.16
    "marshall"        -0.15
    "ool"             -0.15
    "code"            -0.15
    "oppel"           -0.14
    "loff"            -0.14
    "flo"             -0.14
    "oll"             -0.14
    "motion"          -0.14
    Positive Logits
    Token             Logit
    "ipt"             0.19
    " Nach"           0.16
    "yi"              0.15
    "죽"              0.15
    ".HandlerFunc"    0.15
    " bols"           0.14
    "iba"             0.14
    "방송"            0.14
    "än"              0.13
    "vg"              0.13
    Activation density: 0.004%

    No Known Activations