    Explanations

    actions related to preventing or thwarting negative outcomes

    preventing or thwarting actions

    Negative Logits
    endregion       -0.54
    出版年             -0.52
    intios          -0.47
     kasarigan      -0.42
    Salta           -0.40
    Slf             -0.40
    Tikang          -0.40
    dataclass       -0.40
    ngilizce        -0.40
     transfieras    -0.39
    Positive Logits
     prevented      0.65
     verhindert     0.65
     voorkomen      0.63
    prevent         0.59
    阻止              0.56
     preventing     0.56
     prevent        0.56
     averted        0.56
     Prevent        0.55
     avert          0.54
    Act Density 0.035%

    No Known Activations