INDEX
    Explanations

    names of people and places

    Negative Logits
    Token     Logit
    dyn       -0.17
    svp       -0.16
    reon      -0.16
    IDAD      -0.16
    unner     -0.16
     vict     -0.15
    anax      -0.15
    atk       -0.15
    OnError   -0.15
    irse      -0.15
    Positive Logits
    Token     Logit
    ò         0.28
    ì         0.26
    azz       0.23
    acc       0.21
    òn        0.20
    è         0.20
    aggio     0.20
    ini       0.20
    igli      0.20
    izz       0.19
    Activation Density: 0.089%

    No Known Activations