INDEX
    Explanations

    punctuation marks and indicators of inclusion or lists

    Text following colons

    colon followed by one or if

    Negative Logits
    出版年  -0.79
     Савезне  -0.77
    <bos>  -0.71
    ngrx  -0.66
     мәкалә  -0.64
     للمعارف  -0.61
    GHIJKLM  -0.60
     Wikimedijinoj  -0.60
    NSNotification  -0.58
     estimés  -0.58
    Positive Logits
    ↵↵  1.88
    0.95
    ↵↵↵  0.89
    ↵↵↵↵  0.83
    ():  0.79
    <eos>  0.76
    ↵↵↵↵↵  0.72
    ↵↵↵↵↵↵↵  0.64
    ↵↵↵↵↵↵  0.62
    :-  0.61
    Act Density 0.211%

    No Known Activations