INDEX
    Explanations

    religious or moral terms

    words and phrases related to various forms of seriousness or severity

    Negative Logits
    avia      -0.69
    door      -0.69
    WH        -0.66
    ©¶æ       -0.66
    oan       -0.62
    AW        -0.62
    APS       -0.62
    bsite     -0.62
    aver      -0.61
    ARC       -0.60
    Positive Logits
    ness       1.47
    nesses     1.28
    ity        1.12
    ities      0.90
    ly         0.88
    NESS       0.81
    Magikarp   0.80
    ous        0.77
    lihood     0.76
    liness     0.75
    Activation Density: 0.064%

    No Known Activations