INDEX
    Explanations

    content related to religious themes or mentions

    Negative Logits
    Token        Logit
    "piece"      -0.16
    "\Active"    -0.16
    "-piece"     -0.15
    "ialized"    -0.15
    " Uran"      -0.14
    "stanov"     -0.14
    "ssc"        -0.13
    "finity"     -0.13
    "assage"     -0.13
    "Ïģιά"       -0.13
    Positive Logits
    Token        Logit
    "ones"       0.18
    "ly"         0.17
    "LY"         0.16
    "vido"       0.15
    "IPS"        0.15
    " EH"        0.15
    "489"        0.15
    "auga"       0.14
    "ÏĦεÏħ"      0.14
    "ONES"       0.14
    Activation Density: 0.008%

    No Known Activations