INDEX
    Explanations

    references to religious figures or deities

    Negative Logits
    "omu"       -0.17
    "yle"       -0.17
    "ERO"       -0.16
    "cona"      -0.16
    "reds"      -0.15
    "ÑĢой"      -0.15
    " Rub"      -0.14
    "ENC"       -0.14
    "ero"       -0.14
    "ouro"      -0.14

    Positive Logits
    " another"   0.21
    "another"    0.18
    " Another"   0.15
    " Bloss"     0.14
    " Dress"     0.14
    "åı¦"        0.14
    "füg"        0.14
    "Another"    0.14
    "eki"        0.14
    " outra"     0.14

    Activation Density: 0.000%

    No Known Activations