    Explanations

    Phrases indicating how significant an issue or situation is, particularly whether it is a "big deal" or not

    Negative Logits
     nonUne              -0.60
     surla               -0.59
     MaterialApp         -0.58
    KommentareTeilen     -0.56
    Kjelder              -0.55
    ########.            -0.55
     ModelExpression     -0.54
    Autoritní            -0.52
     createSlice         -0.52
     Comprometido        -0.51
    Positive Logits
     harmless            0.63
     innoc               0.41
     nothing             0.40
     insignificant       0.40
    没事                 0.39
     innocently          0.38
     problemlos          0.38
     Nothing             0.37
     Miscellaneous       0.37
    没什么               0.37
    Act Density 0.055%

    No Known Activations