INDEX
    Explanations

    Names and surnames of individuals, particularly those who are widely recognized or mentioned in a notable context

    Negative Logits
    Token           Logit
    "<bos>"         -1.40
    " intersper"    -0.84
    "ethene"        -0.69
    "Referencie"    -0.69
    "ötä"           -0.68
    "Více"          -0.65
    " apprehen"     -0.65
    " quitted"      -0.64
                    -0.63
    " trod"         -0.62
    Positive Logits
    Token     Logit
    " dy"     1.02
    " Wy"     1.02
    " Dy"     0.99
    "Wy"      0.97
    " Ry"     0.96
    " RY"     0.96
    " DY"     0.94
    " Gy"     0.93
    " Dys"    0.93
    "dy"      0.92
    Act Density 0.206%

    No Known Activations