INDEX
    Explanations

    metadata and social media mentions, specifically items enclosed in brackets

    various types of brackets or symbols often involving lists or updates

    Negative Logits
     poisoning
    -0.78
     converted
    -0.74
     equivalents
    -0.74
     appropri
    -0.74
     transported
    -0.71
     continuity
    -0.71
     setting
    -0.71
     virginity
    -0.71
     values
    -0.70
     allowance
    -0.70
    Positive Logits
    …]
    1.56
    ...]
    1.51
    np
    1.27
    Native
    1.23
    youtube
    1.23
    email
    1.21
    Laughs
    1.20
    !]
    1.19
    UPDATE
    1.19
    Edited
    1.19
    Act Density 0.025%

    No Known Activations