INDEX
    Explanations

    proper nouns, particularly names and places

    NEGATIVE LOGITS
    シャ        -0.73
    Beck      -0.72
    hardt     -0.67
    adr       -0.67
    IGH       -0.67
    IFE       -0.63
    arist     -0.62
    omes      -0.62
     sorts    -0.61
    UFF       -0.61
    POSITIVE LOGITS
    etsk      0.95
     Pok      0.93
    asus      0.93
    oshenko   0.88
    ongyang   0.86
    atern     0.83
    etry      0.82
    ilon      0.77
    gran      0.77
    etary     0.77
    Activation Density: 0.042%

    No Known Activations