INDEX
    Explanations

    instances of specific names, particularly those of female characters or notable women

    Negative Logits
    itas         -0.16
    duino        -0.15
    ãĥŃãĥ³       -0.15
     Bernstein   -0.15
    tach         -0.15
     Sınıf       -0.15
    built        -0.14
    abort        -0.14
    æ²»          -0.14
    kup          -0.14
    Positive Logits
    duct         0.16
    -medium      0.16
     Pent        0.15
    ductor       0.14
    ần           0.14
    ENTE         0.14
     lược        0.14
    ente         0.14
     Toe         0.14
    iska         0.14
    Activation Density: 0.013%

    No Known Activations