INDEX
    Explanations

    phrases focused on identity and relationships

    Negative Logits
    oleÄį       -0.15
    kowski      -0.15
    Ãło         -0.14
    _registry   -0.14
    agle        -0.14
    klady       -0.14
     Eh         -0.14
    à¸Ńà¸ģ      -0.13
    ×ķ          -0.13
    zman        -0.13
    Positive Logits
    itom        0.16
    rist        0.15
    chu         0.14
    pcs         0.14
     Fletcher   0.14
    legen       0.14
     Rak        0.14
     Rick       0.14
    INO         0.13
     capture    0.13
    Act Density 0.043%

    No Known Activations