INDEX
    Explanations
    Negative Logits
    " Económica"      -0.88
    " chré"           -0.84
    "featureID"       -0.82
    " sauvages"       -0.82
    " étoient"        -0.79
    " étoit"          -0.77
    " giustizia"      -0.77
    " ainfi"          -0.77
    " commerciaux"    -0.77
    " ejus"           -0.76
    Positive Logits
    "sp"              0.60
    "fac"             0.52
    " inner"          0.50
    "r"               0.47
    "SP"              0.47
    "if"              0.47
    "inos"            0.45
    "not"             0.45
    "bot"             0.45
    "ase"             0.44
    Activation Density: 0.069%

    No Known Activations