    Explanations

    names that likely belong to characters in a movie or a TV show

    Negative Logits
    Token       Logit
    <bos>       -2.23
    guang       -0.98
    qiao        -0.84
    xiu         -0.84
    qian        -0.75
    huo         -0.73
    HideFlags   -0.72
    yao         -0.71
    xun         -0.71
    luo         -0.70

    Positive Logits
    Token       Logit
     soulign    1.25
     véhic      1.21
     fameux     1.13
     accla      1.13
     unspeak    1.09
     dénon      1.04
     eiffel     1.03
     Mejía      1.03
     zove       1.02
     vété       1.01

    Activation Density: 0.176%

    No Known Activations