INDEX
    Explanations

    personal pronouns reflecting individual or group identity

    Negative Logits
    "expandindo"          -1.13
    " kasarigan"          -1.06
    " متعلقه"             -0.88
    "Hentet"              -0.83
    "featureID"           -0.82
    "InjectAttribute"     -0.81
    " resourceCulture"    -0.80
    " ligiloj"            -0.79
    "afficheront"         -0.78
    "RegressionTest"      -0.76
    Positive Logits
    "He"      0.67
    " He"     0.65
    "We"      0.59
    "he"      0.55
    "Our"     0.55
    " We"     0.51
    "ोंने"    0.50
    " Our"    0.49
    " he"     0.49
    "She"     0.48
    Act Density 0.404%

    No Known Activations