INDEX
    Explanations

    words related to specific individuals or identifiers

    Negative Logits
    Token        Logit
    "ixa"        -0.16
    "ixer"       -0.16
    "lds"        -0.15
    "ousse"      -0.15
    "ij"         -0.15
    "lep"        -0.15
    "ÑĥÑĪки"     -0.14
    "IE"         -0.14
    "ãģĹãģı"     -0.14
    "Ñĥж"        -0.14
    Positive Logits
    Token          Logit
    " above"       0.20
    " Above"       0.18
    "above"        0.18
    "Above"        0.18
    " ABOVE"       0.16
    "Unified"      0.15
    "rror"         0.15
    " BeÅŁ"        0.15
    "ÑĤоÑĢ"        0.15
    " foregoing"   0.14
    Activation Density: 0.142%

    No Known Activations