    Explanations

    specific names of individuals, likely related to certain professions or activities

    proper nouns, particularly names of people and places

    Negative Logits
    Token        Logit
    "ggle"       -0.71
    "uma"        -0.70
    "pport"      -0.70
    "ction"      -0.69
    "fights"     -0.68
    "matic"      -0.68
    "fight"      -0.66
    "enic"       -0.64
    "xx"         -0.63
    "umat"       -0.63
    Positive Logits
    Token        Logit
    "imore"      0.85
    "éĹĺ"        0.81
    "lees"       0.78
    "imer"       0.75
    "arson"      0.74
    "Redditor"   0.74
    "stown"      0.72
    " fences"    0.71
    "espie"      0.71
    "inelli"     0.69
    Act Density 0.075%

    No Known Activations