    Explanations

    words related to programming or setting particular behaviors and attributes

    terms related to programming and conditioning in a biological or artificial context

    Negative Logits
    " Cheong"       -0.81
    " Sack"         -0.79
    "fal"           -0.77
    "Scotland"      -0.76
    "apest"         -0.69
    "apers"         -0.65
    "iversary"      -0.65
    "asta"          -0.64
    " umbrella"     -0.63
    " Hague"        -0.63

    Positive Logits
    "eering"         0.88
    " Reloaded"      0.82
    " instincts"     0.81
    "strap"          0.77
    "washed"         0.75
    "eer"            0.74
    " bred"          0.72
    "ependent"       0.70
    " instinct"      0.70
    " algorithms"    0.68

    Activation Density: 0.139%

    No Known Activations