INDEX
    Explanations

    first name followed by surname

    Negative Logits
    Karen    0.94
     Debra    0.93
     Kathy    0.91
    ahue    0.90
    ত্যার    0.89
    苏联    0.89
    (token not shown)    0.88
    (token not shown)    0.88
     θε    0.86
    Gloria    0.86
    Positive Logits
    监管    0.73
     Minecraft    0.68
     Instagram    0.67
     đá    0.67
    Guesses    0.66
     b    0.66
    パーツ    0.66
     reddit    0.65
     kontroll    0.65
    Saharan    0.65
    Activation Density: 0.001%

    No Known Activations