    Explanations

    references to milk and dairy products

    Negative Logits
    "ihar"         -0.17
    "iş"           -0.17
    "ucher"        -0.15
    "reffen"       -0.14
    "sembles"      -0.14
    " libertine"   -0.14
    " Ville"       -0.14
    " Danh"        -0.14
    "anie"         -0.14
    "aign"         -0.13
    Positive Logits
    "shake"        0.44
    "maid"         0.37
    "maids"        0.33
    "weed"         0.32
    "sh"           0.28
    "shed"         0.25
    "fat"          0.25
    "man"          0.23
    "ier"          0.22
    " repl"        0.22
    Act Density 0.010%

    No Known Activations