INDEX
    Explanations

    words that signal causality, reasons, or logical arguments

    Negative Logits
    Hauptartikel      -0.65
     cref             -0.58
    WriteAttribute    -0.57
    AndEndTag         -0.57
     EconPapers       -0.56
    VersionUID        -0.54
    Leírás            -0.50
    öglichkeiten      -0.49
    ItemBackground    -0.48
    ?>">              -0.48
    Positive Logits
     not      2.70
    not       2.31
    Not       2.17
     Not      2.06
    NOT       1.84
     NOT      1.83
     nicht    1.31
     niet     1.18
     не       1.16
     ikke     1.12
    Activation Density: 1.743%

    No Known Activations