INDEX
    Explanations

    names of individuals

    references to the word "lad" and related variations

    Negative Logits
    Token           Logit
    "ELL"           -0.83
    "kward"         -0.69
    "plays"         -0.67
    "forts"         -0.67
    "EED"           -0.67
    "basketball"    -0.66
    " Bakr"         -0.64
    " ticking"      -0.64
    "中"            -0.63
    "oldown"        -0.62
    Positive Logits
    Token           Logit
    "imir"          1.41
    "der"           1.04
    "isl"           0.93
    "mir"           0.92
    "ynam"          0.92
    "itionally"     0.85
    "itional"       0.82
    "amus"          0.80
    "ewater"        0.79
    "rian"          0.78
    Activation Density: 0.036%

    No Known Activations