INDEX
    Explanations

    Occurrences of the token "tw" and its variations, suggesting a focus on social media references, particularly Twitter.

    Negative Logits
    vetica
    -0.16
    ung
    -0.15
    تاب
    -0.15
    jav
    -0.15
    овар
    -0.14
    ليه
    -0.14
    UNG
    -0.14
    hyp
    -0.14
    тери
    -0.14
    istrovství
    -0.14
    Positive Logits
    viso
    0.20
    围
    0.17
    ór
    0.15
    ided
    0.15
    assi
    0.15
    nee
    0.14
    저
    0.14
    etik
    0.14
    654
    0.13
    esor
    0.13
    Act Density 0.012%

    No Known Activations