INDEX
    Explanations

    mentions of "Hell" and related contexts

    Negative Logits
    ibus      -0.16
    noon      -0.15
    okie      -0.15
    Roz       -0.14
    ellig     -0.14
    etime     -0.14
    ovich     -0.14
    eway      -0.14
    iliz      -0.14
    Ĭ         -0.14
    Positive Logits
    fire      0.22
    acious    0.21
    zap       0.20
    inois     0.19
    iday      0.18
    BOUND     0.18
    raising   0.17
    ipt       0.17
    bound     0.17
     hath     0.17
    Act Density 0.011%

    No Known Activations