    Explanations

    mentions of religious leaders, particularly rabbis

    Negative Logits
    Token       Logit
    "ighton"    -0.17
    "zet"       -0.15
    "ktion"     -0.15
    "ance"      -0.15
    "оступ"     -0.15
    "ANDARD"    -0.15
    "leton"     -0.15
    "atural"    -0.15
    "ori"       -0.14
    "ilers"     -0.14
    Positive Logits
    Token         Logit
    " rab"        0.20
    " Rab"        0.20
    "bin"         0.20
    "rab"         0.17
    "bits"        0.17
    "bi"          0.16
    "idity"       0.16
    " rabbits"    0.16
    "shake"       0.16
    "BIT"         0.16
    Activation Density: 0.004%

    No Known Activations