INDEX
    Explanations

    sentences that include the word "said," indicating the presence of quotes or speech

    Negative Logits
    Token          Logit
    "åĥıæĺ¯"       -0.15
    " Express"     -0.14
    "lean"         -0.14
    " Shan"        -0.14
    "ynet"         -0.13
    " поÑĩ"        -0.13
    "ville"        -0.13
    "ags"          -0.13
    " Julian"      -0.13
    " signal"      -0.13
    Positive Logits
    Token          Logit
    "agli"         0.17
    "arde"         0.17
    " Aires"       0.15
    "λεκ"          0.14
    " reluct"      0.14
    "arken"        0.14
    " γεÏģ"        0.14
    "eyh"          0.14
    "zk"           0.14
    "ailer"        0.14
    Activation Density: 0.061%

    No Known Activations