INDEX
    Explanations

    words related to health and a variety of effects or conditions

    Negative Logits
    Token        Logit
    "-runner"    -0.18
    "lying"      -0.15
    "letcher"    -0.14
    "öh"         -0.14
    "ancell"     -0.14
    "üssen"      -0.14
    " corros"    -0.14
    "jeme"       -0.14
    " nackte"    -0.14
    "loon"       -0.14
    Positive Logits
    Token          Logit
    "ellschaft"    0.27
    " Ges"         0.19
    "ellig"        0.18
    " Ellen"       0.17
    "und"          0.17
    "ichte"        0.17
    "ells"         0.17
    "ell"          0.17
    "ocks"         0.15
    " rack"        0.15
    Activation Density: 0.010%

    No Known Activations