INDEX
    Explanations

    mentions of Christianity and related terms

    Negative Logits
    Token               Logit
    " terecht"          -0.44
    " vectorielle"      -0.44
                        -0.44
    "stateParams"       -0.43
    "pulseira"          -0.42
    "RegressionTest"    -0.42
    "katakan"           -0.42
    " Efq"              -0.41
    "fromnode"          -0.41
    "olulu"             -0.41
    Positive Logits
    Token               Logit
    " Dior"             0.52
    "Dior"              0.42
    " Bale"             0.42
    " Slater"           0.39
    " dior"             0.38
    "iddhar"            0.37
                        0.36
    "สือ"               0.36
    " saites"           0.36
    "-------"           0.35
    Activation Density: 0.204%

    No Known Activations