INDEX
    Explanations

    words related to physical thinness or slimness

    occurrences of the word "thin" and its variations

    Negative Logits
    Token       Logit
    "bucks"     -0.77
    "ktop"      -0.71
    "another"   -0.65
    "perty"     -0.61
    "dam"       -0.60
    " Admir"    -0.59
    "ontent"    -0.59
    "0100"      -0.59
    "dad"       -0.59
    "HI"        -0.58
    Positive Logits
    Token       Logit
    "ned"       1.57
    "ning"      1.48
    "ners"      1.26
    "ening"     1.07
    "ened"      0.96
    "layer"     0.95
    "nery"      0.94
    "nesses"    0.92
    " slices"   0.89
    "ness"      0.87
    Activation Density: 0.059%

    No Known Activations