INDEX
    Explanations

    names or words related to names, specifically those ending in "am"

    Negative Logits
     resemb       -0.68
     stakes       -0.68
     kinderg      -0.67
     occupants    -0.66
     masks        -0.65
     steroids     -0.64
     dimensions   -0.64
     corners      -0.63
     edges        -0.62
    ¥ŀ            -0.62

    Positive Logits
    nesty          1.32
    endment        1.25
    sterdam        1.18
    borgh          1.12
    bitious        1.11
    ilies          1.11
    essage         1.09
    urai           1.09
    ateur          1.03
    ulet           1.02

    Act Density 0.023%

    No Known Activations