INDEX
    Explanations

    phrases related to community guidelines and respectful communication

    Head Attr Weights
    0:0.08
    1:0.11
    2:0.03
    3:0.11
    4:0.07
    5:0.14
    6:0.08
    7:0.06
    8:0.04
    9:0.13
    10:0.07
    11:0.03
    Negative Logits
    "��"            -2.67
    " acron"        -2.36
    " Scheme"       -2.34
    "Tile"          -2.31
    " Agric"        -2.26
    " Solitaire"    -2.25
    " IRC"          -2.17
    " Slime"        -2.14
    " Secondly"     -2.14
    " Templ"        -2.13
    Positive Logits
    "bott"          2.42
    "ils"           2.39
    "Gab"           2.35
    "ohan"          2.33
    "dy"            2.29
    " dent"         2.23
    " Gab"          2.18
    " Vulkan"       2.18
    " Dol"          2.17
    "ordan"         2.16
    Act Density 0.002%

    No Known Activations