    Explanations

    words in a non-Latin script, possibly Cyrillic

    characters or symbols, potentially indicating special or foreign textual elements

    Negative Logits
     manif  -0.67
     dolls  -0.65
     Riley  -0.65
     conduc  -0.64
     temptation  -0.64
     visitation  -0.64
    ktop  -0.64
    bourg  -0.64
    enegger  -0.61
     caravan  -0.61
    Positive Logits
    оÐ  1.08
    е  1.04
    ÑĢ  1.04
    ¬  1.04
    Į  1.03
    °  1.02
    Ĺ  1.00
    ĺ  1.00
    Ñĥ  0.99
    ¹  0.99
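The positive-logit tokens above look garbled, but they are consistent with the Cyrillic explanation if the model uses a GPT-2-style byte-level BPE vocabulary. That is an assumption: the page does not name the tokenizer. Under that assumption each displayed character stands for one raw byte, and a multi-byte UTF-8 character is split across such fragments. The sketch below rebuilds the standard GPT-2 byte-to-character table and inverts it; `token_to_bytes` is a hypothetical helper name introduced here for illustration.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2-style byte-level BPE."""
    # Bytes that already have a printable Latin-1 character map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> raw byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Hypothetical helper: turn a displayed vocabulary entry back into raw bytes."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


# "\u00d1\u0122" is the entry shown as 'ÑĢ'; "\u00d1\u0125" is the one shown as 'Ñĥ'.
for shown in ("\u00d1\u0122", "\u00d1\u0125"):
    raw = token_to_bytes(shown)
    print(repr(shown), raw.hex(), raw.decode("utf-8"))
```

Single-character entries such as `°` or `¹` map to one UTF-8 continuation byte (0xB0, 0xB9), so they decode only when joined with a preceding lead byte such as 0xD0. Entries that already display a real Cyrillic letter appear to have been partially decoded by the page and are not valid input to this helper.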
    Activation Density: 0.051%

    No Known Activations