INDEX
    Explanations

    terms related to controversial topics or societal issues, particularly ones involving laws, regulations, and societal debates

    Negative Logits
     Sorceress      -0.64
     Museum         -0.64
     Raider         -0.63
     Ll             -0.60
     Lair           -0.59
     logo           -0.59
     LORD           -0.59
     Dragonbound    -0.57
     recorder       -0.57
     Library        -0.57
    Positive Logits
    reating    1.04
    ipping     0.99
    anging     0.95
    ogging     0.95
    ailing     0.94
    itting     0.94
    isting     0.93
    ashing     0.92
    inging     0.92
    aring      0.92
    Act Density 0.449%
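
    An activation density figure like the one above is usually read as the share of sampled token positions on which the feature fires. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the sample data are all assumptions for illustration, not values or methods taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" on a token when its activation
    is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 4 of which fire.
sample = [0.0] * 996 + [0.7, 1.2, 0.3, 2.5]
density = activation_density(sample)

# Format as a percentage with three decimals, matching the style of the line above.
print(f"Act Density {density:.3%}")
```

    Under this reading, a density of 0.449% would mean the feature fires on roughly 4 to 5 of every 1,000 sampled tokens.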

    No Known Activations