INDEX
    Explanations

    expressions related to admiration and respect for individuals

    positive descriptions and roles

    Negative Logits
    " dise"              -0.45
    "новниш"             -0.44
    "хьтан"              -0.41
    "pal"                -0.41
    " zahl"              -0.40
    " staging"           -0.39
    " unsatisfactory"    -0.39
    " pal"               -0.38
    " Appel"             -0.38
    " final"             -0.37
    Positive Logits
    "rungsseite"         0.55
    " dignité"           0.48
    " compétence"        0.48
    " humanidade"        0.47
    " sagesse"           0.47
    "ۜ"                  0.46
    "testify"            0.45
    " GenerationType"    0.45
    " Infór"             0.44
    " Memiliki"          0.44
    Act Density 0.024%

    No Known Activations