INDEX
    Explanations

    names and locations, particularly those involving sports

    the presence of specific gerunds or verb forms ending in 'ing'

    Negative Logits
    'ihar'      -0.76
    'nery'      -0.75
    'ngth'      -0.73
    'ably'      -0.69
    'icles'     -0.68
    'uba'       -0.67
    'ophob'     -0.66
    'igning'    -0.65
    'iness'     -0.65
    'idity'     -0.65
    Positive Logits
    'tons'       1.35
    'ham'        1.17
    'HAM'        1.13
    'ton'        1.05
    'redients'   0.97
    ' Stones'    0.96
    ' Sands'     0.93
    'edge'       0.86
    'lass'       0.85
    'haus'       0.83
    Activation Density: 0.119%

    No Known Activations