INDEX
    Explanations

    phrases related to telling or informing someone about something

    pronouns and the presence of personal references in sentences

    Negative Logits
     Wikimedia                -0.70
    ource                     -0.65
    eland                     -0.65
    hement                    -0.63
     edges                    -0.62
    BuyableInstoreAndOnline   -0.62
    nel                       -0.60
    ument                     -0.58
    adel                      -0.58
    malink                    -0.57
    Positive Logits
    Filename                  0.75
     goodbye                  0.75
     bluff                    0.70
    psc                       0.70
     "#                       0.68
     '[                       0.68
    =\"                       0.67
    asta                      0.66
     "'                       0.63
     "\                       0.62
    Activation Density: 0.233%

    No Known Activations