INDEX
    Explanations

    references to religious or cultural figures and practices

    Negative Logits
        Token         Logit
        " Eden"       -0.16
        "elden"       -0.15
        "Toolkit"     -0.14
        "pike"        -0.14
        " Monsters"   -0.14
        ".tencent"    -0.13
        "жÑĥ"         -0.13
        "ismet"       -0.13
        "Pie"         -0.13
        "UCT"         -0.13

    Positive Logits
        Token         Logit
        " Ling"        0.27
        " ling"        0.26
        " temple"      0.26
        " Lord"        0.25
        " devote"      0.24
        "Lord"         0.24
        " poo"         0.23
        " lord"        0.22
        " temples"     0.22
        " Temp"        0.21
    Activation Density: 0.184%

    No Known Activations