    Explanations

    proper nouns, specifically names of individuals

    Negative Logits
    "ï¸ı"            -0.77
    " Compass"       -0.68
    "ĸļ"             -0.68
    "UGE"            -0.67
    " Totem"         -0.64
    "netflix"        -0.63
    " Orient"        -0.62
    "usercontent"    -0.60
    " Esk"           -0.59
    "lvl"            -0.58
    Positive Logits
    "elli"           0.70
    "kov"            0.69
    "wagen"          0.68
    "amins"          0.65
    "ovic"           0.64
    "Å¡"             0.64
    "igi"            0.62
    "ucci"           0.60
    "hof"            0.59
    "ello"           0.58
    Activation Density: 0.104%

    No Known Activations