INDEX
    Explanations

    proper nouns, including names of people and locations

    Negative Logits
    anca
    -0.17
    оби
    -0.16
    IGH
    -0.15
     поступ
    -0.15
    ıb
    -0.14
     ani
    -0.14
    asal
    -0.14
    loff
    -0.13
     Charge
    -0.13
    aban
    -0.13
    Positive Logits
    enson
    0.15
     hon
    0.14
    419
    0.14
    642
    0.14
    369
    0.14
     spons
    0.14
     Orch
    0.14
    δρα
    0.13
    gis
    0.13
     Trades
    0.13
    Act Density 0.037%

    No Known Activations