INDEX
    Explanations

    punctuation marks and formatting symbols used in written text

    list separators and parentheses

    Negative Logits
     interven           -0.49
     Intervention       -0.47
     intervention       -0.43
    NSYLVANIA           -0.43
    yntaxException      -0.42
     disturb            -0.41
    ADE                 -0.41
    RTEE                -0.41
    casian              -0.41
    trường              -0.41
    Positive Logits
    BeginContext        0.53
     мәкал              0.46
     BorderRadius       0.44
     Catawiki           0.43
    RTLR                0.43
                        0.40
     beginnetje         0.40
     transfieras        0.39
    قایناقلار           0.39
    outSlope            0.39
    Activation Density: 0.132%

    No Known Activations