    Explanations

    emphatic words and phrases indicating necessity or importance

    strong recommendations or requirements

    Negative Logits
     للمعارف  -0.64
     برانيه  -0.61
    GOTREF  -0.59
    (token not shown)  -0.56
    AutoScale  -0.56
    addContainerGap  -0.56
     surla  -0.56
     ModelExpression  -0.53
    onoi  -0.52
    KommentareTeilen  -0.52
    Positive Logits
     MUST  0.46
    絶対に  0.43
     must  0.43
     safety  0.42
     SAFETY  0.40
    !!!  0.40
     kesin  0.39
    MUST  0.38
     absolutely  0.38
    sizePolicy  0.38
    Act Density 0.019%

    No Known Activations