INDEX
    Negative Logits
    " Stephanie"    -0.07
    " Jonathan"     -0.06
    ".Position"     -0.06
    " Dave"         -0.06
    "372"           -0.06
    "Dave"          -0.06
    "Mark"          -0.06
    " Phill"        -0.06
    " Steve"        -0.06
    " Links"        -0.06
    Positive Logits
    "conscious"         0.07
    [token not shown]   0.07
    "-create"           0.07
    [token not shown]   0.06
    "_gt"               0.06
    "Ар"                0.06
    "_WITH"             0.06
    "omatic"            0.06
    "řád"               0.06
    "yt"                0.06
    Act Density 0.015%

    No Known Activations