INDEX
    Explanations

    concepts related to power and authority

    Negative Logits
    " Brunswick"    -0.15
    "onda"          -0.15
    "beros"         -0.15
    " gig"          -0.15
    ".sb"           -0.15
    "ahas"          -0.14
    "ucci"          -0.14
    "*scale"        -0.14
    " nuest"        -0.14
    "ovich"         -0.14
    Positive Logits
    " Barbar"       0.15
    "jn"            0.15
    " ê"            0.14
    " Hind"         0.14
    " Å"            0.13
    "111"           0.13
    " [[["          0.13
    "vant"          0.13
    " hind"         0.13
    "xbd"           0.13
    Activation Density: 0.013%

    No Known Activations