    Explanations

    mentions of specific cultural or religious terms and significant figures

    Negative Logits
    Token       Logit
    " Abram"    -0.19
    "reau"      -0.16
    " Till"     -0.15
    "ffen"      -0.15
    "intl"      -0.14
    "ouro"      -0.14
    " ук"       -0.14
    "·»"        -0.14
    "atura"     -0.14
    "abeth"     -0.14
    Positive Logits
    Token       Logit
    "RIX"       0.16
    "_DROP"     0.16
    "iej"       0.15
    "monster"   0.14
    "ewise"     0.14
    "eting"     0.14
    " pel"      0.14
    "myfile"    0.14
    "рож"       0.14
    "undle"     0.14
    Activation Density: 0.023%

    No Known Activations