INDEX
    Explanations

    Occurrences of the token "<bos>", which signal the beginning of a new section or paragraph.

    Negative Logits
    Token               Logit
    "TagMode"           -0.64
    "Rhestr"            -0.56
    " Nestor"           -0.54
    " itſelf"           -0.53
    "🟤"                -0.52
    " antidesliz"       -0.50
    " goddesses"        -0.50
    "yourself"          -0.49
    "yves"              -0.49
    "chargez"           -0.48

    Positive Logits
    Token               Logit
    "IsContent"         0.88
    "Personendaten"     0.80
    " bezeichneter"     0.78
    "WebVitals"         0.70
    " utafitiHapana"    0.69
    "èdia"              0.67
    "writeField"        0.65
    " CanadaChoose"     0.63
    "NameInMap"         0.62
    " ویکی‌پدی"         0.62

    Activation Density: 0.913%

    No Known Activations