INDEX
    Explanations

    mentions of Islam and related terminology

    Negative Logits
    Token          Logit
    "manship"      -0.17
    "umen"         -0.17
    "ake"          -0.16
    "ango"         -0.15
    "aman"         -0.15
    "/kernel"      -0.15
    "째"           -0.14
    "age"          -0.14
    "ui"           -0.14
    " Sen"         -0.14
    Positive Logits
    Token               Logit
    " +:+"              0.17
    "Č↵"                0.16
    ".scalablytyped"    0.16
    "->___"             0.15
    "Binder"            0.15
    "abyrinth"          0.15
    ".synthetic"        0.15
    "(SIG"              0.14
    "afen"              0.14
    " Binder"           0.14
    Activation Density: 0.020%

    No Known Activations