INDEX
    Explanations

    references to food and dining experiences

    Negative Logits
    "fal"         -0.17
    "rub"         -0.16
    "cona"        -0.15
    "gradable"    -0.14
    " fal"        -0.14
    " Cage"       -0.14
    "peria"       -0.14
    " jenter"     -0.14
    " rub"        -0.14
    "icket"       -0.14
    Positive Logits
    " liver"      0.23
    "Liver"       0.20
    " Liver"      0.20
    " kidney"     0.19
    " Spam"       0.19
    " cust"       0.18
    " kidneys"    0.18
    " hashed"     0.17
    " spam"       0.17
    " stew"       0.16
    Act Density 0.089%

    No Known Activations