INDEX
    Explanations

    references to general plural nouns indicating people or groups

    Negative Logits
    Token               Logit
    いる                -0.30
    いた                -0.28
    ————————————————    -0.21
    ت                   -0.21
    ————————            -0.20
    ated                -0.19
    ————                -0.18
    ample               -0.18
    ะ                   -0.17
    aphore              -0.17
    Positive Logits
    Token               Logit
    ร                   0.31
    न                   0.27
    서는                0.25
    iy                  0.24
    서                  0.20
    iw                  0.20
    ment                0.19
    م                   0.19
    ない                0.18
    acon                0.18
    Activation Density 0.448%

    No Known Activations