    Explanations

    personal pronouns and possessive pronouns

    Negative Logits
     attes        -0.67
     relativi     -0.63
     hoj          -0.59
     Février      -0.59
     liev         -0.57
     incess       -0.57
     quí          -0.56
     Novembre     -0.55
     suscit       -0.55
     inverte      -0.55
    Positive Logits
     into         0.84
     away         0.76
     onto         0.72
     back         0.69
     toward       0.62
     forward      0.62
     towards      0.61
     astray       0.59
     ashore       0.57
     down         0.57
    Activation Density: 0.286%

    No Known Activations