INDEX
    Explanations

    the beginning of a document or text, indicating the start of a significant section

    Negative Logits
    titleMargin         -1.16
    featureID           -1.07
    Personendaten       -1.05
    webElementXpaths    -1.04
    NameInMap           -1.01
     Walkover           -1.00
    parsedMessage       -1.00
    ItemBackground      -0.98
    Vidite              -0.98
    kháu                -0.98
    Positive Logits
    гова                0.43
                        0.42
    [toxicity=0]        0.39
     Smyth              0.39
    ↵↵                  0.37
    #                   0.37
                        0.37
     Dan                0.36
     he                 0.36
    人都                  0.36
    Act Density 0.889%

    No Known Activations