INDEX
    Explanations

    mentions of fictional works and their relationship to reality or true stories

    Negative Logits
    "ickets"      -0.15
    "anela"       -0.15
    " Jury"       -0.15
    "zbollah"     -0.14
    "antly"       -0.14
    "letcher"     -0.14
    " wet"        -0.14
    "İY"          -0.14
    "особ"        -0.14
    " gut"        -0.13

    Positive Logits
    "bens"         0.16
    "vere"         0.16
    "appy"         0.15
    "eler"         0.14
    "urge"         0.14
    "datatype"     0.14
    "inta"         0.14
    "dra"          0.14
    " Zug"         0.14
    "MRI"          0.13

    Activation Density: 0.180%

    No Known Activations