INDEX
    Explanations

    names of notable actresses and their roles or appearances in various films and shows

    Negative Logits
     himself
    -0.23
     Himself
    -0.19
    sdale
    -0.16
    妻
    -0.16
     David
    -0.16
     handsome
    -0.15
    ady
    -0.14
     Teddy
    -0.14
    juan
    -0.14
    pollo
    -0.14
    Positive Logits
     herself
    0.28
    олева
    0.18
    ová
    0.17
    athed
    0.17
     сама
    0.16
     Dawn
    0.15
     latina
    0.15
     могла
    0.15
     Leigh
    0.15
     сказала
    0.15
    Activation Density: 0.126%

    No Known Activations