INDEX
    Explanations

    sequences containing specific characters, potentially as part of code or other specialized text

    occurrences of a specific character or symbol, likely focusing on a particular character or motif throughout the document

    Negative Logits
    geries          -0.94
     distracting    -0.69
     distracted     -0.69
    sworth          -0.67
     responders     -0.65
     regul          -0.65
     foreground     -0.64
     tee            -0.63
    luent           -0.63
    raints          -0.62
    Positive Logits
    и               1.05
    ski             0.92
    о               0.92
    à¸              0.89
    оÐ              0.88
    Ñĥ              0.87
    æŃ¦             0.85
    âĸĦ             0.85
    е               0.84
    а               0.84
    Activation Density: 0.008%

    No Known Activations