INDEX
    Explanations

    phrases or words related to a specific individual or entity, likely named "Der" with varying activations

    references to the term "Der" in various contexts

    Negative Logits
        Token       Logit
        " taco"     -0.72
        " canoe"    -0.66
        "TPS"       -0.65
        "oka"       -0.65
        "ãĥīãĥ©"    -0.64
        "box"       -0.64
        "ogg"       -0.63
        "poon"      -0.63
        " ping"     -0.62
        " omn"      -0.62
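
The negative-logit entry ãĥīãĥ© is probably not corruption: it looks like a byte-level BPE token shown in its raw vocabulary form. The sketch below assumes the model uses the GPT-2 style byte-to-character mapping; the source does not name the tokenizer, so treat that as an assumption. The names `bytes_to_unicode`, `UNICODE_TO_BYTE`, and `decode_token` are illustrative and not taken from this page.

```python
def bytes_to_unicode():
    """Byte -> printable-character table in the style of GPT-2 byte-level BPE.

    Assumption: the tokenizer behind this page uses this mapping.
    """
    # Bytes that are already printable keep their own code point.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(0xA1, 0xAC + 1))
          + list(range(0xAE, 0xFF + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: vocabulary character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a vocabulary-form token back into readable text."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


# The token from the list above, written with escapes so that the
# example does not depend on how this page's characters were normalized.
print(decode_token("\u00e3\u0125\u012b\u00e3\u0125\u00a9"))
# In this scheme a leading "\u0120" stands for a space before the word.
print(decode_token("\u0120Der"))
```

Under that assumption, the token decodes to the katakana pair ドラ, and a token written with a leading `Ġ` corresponds to the space-prefixed entries in these lists, such as " Der".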
    Positive Logits
        Token             Logit
        " Der"            4.05
        "Der"             3.16
        " der"            1.65
        "der"             1.38
        " Derby"          1.29
        " dermat"         1.29
        " derivatives"    1.25
        " deriv"          1.06
        " Die"            1.05
        " derivative"     1.04
    Activation Density: 0.019%

    No Known Activations