    Explanations

    quantitative measurements and data

    Negative Logits
    Token         Logit
    " Dem"        -0.17
    "esty"        -0.15
    "udiant"      -0.14
    " Fate"       -0.14
    "nes"         -0.14
    "IFO"         -0.14
    "vail"        -0.14
    " sene"       -0.14
    "oyer"        -0.14
    "ujet"        -0.14
    Positive Logits
    Token         Logit
    "olit"        0.15
    "ppers"       0.14
    "integral"    0.14
    "iman"        0.14
    "unk"         0.13
    "gens"        0.13
    "ARIO"        0.13
    "æĺ¨"         0.13
    " Firearms"   0.13
    "autiful"     0.13
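The positive-logit token "æĺ¨" looks like a byte-level BPE vocabulary string rather than readable text. The sketch below shows one way to decode such entries, assuming the underlying tokenizer uses the GPT-2 byte-to-unicode table; the page does not say which tokenizer produced these tokens, so that is an assumption. The function names `gpt2_unicode_to_byte` and `decode_token` are hypothetical helpers introduced here for illustration.

```python
def gpt2_unicode_to_byte():
    """Invert the GPT-2 byte-to-unicode table (assumed tokenizer convention)."""
    # Bytes that GPT-2 keeps as their own printable character:
    # "!".."~" (33..126), U+00A1..U+00AC (161..172), U+00AE..U+00FF (174..255).
    printable = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    byte_values = list(printable)
    code_points = list(printable)
    offset = 0
    # Every remaining byte is shifted to code point 256 + offset, in byte order.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {chr(c): b for b, c in zip(byte_values, code_points)}


def decode_token(token: str) -> str:
    """Map a vocabulary string back to bytes, then decode those bytes as UTF-8."""
    table = gpt2_unicode_to_byte()
    raw = b"".join(
        # Characters outside the table (e.g. a literal leading space that the
        # page already rendered) are passed through as their own UTF-8 bytes.
        bytes([table[ch]]) if ch in table else ch.encode("utf-8")
        for ch in token
    )
    # Tokens that are only part of a multi-byte character decode to U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("\u00e6\u013a\u00a8"))  # the "æĺ¨" entry
```

Under that assumption the entry maps to the bytes E6 98 A8, which is the UTF-8 encoding of the single character 昨. Plain ASCII tokens such as "integral" pass through unchanged.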
    Activation Density: 0.126%

    No Known Activations