INDEX
    Explanations

    References to development, particularly social, economic, or technological progress

    Negative Logits
    Token       Logit
    "iff"       -0.16
    " lẽ"       -0.15
    "er"        -0.15
    "KER"       -0.14
    "ball"      -0.14
    "dre"       -0.14
    "indsay"    -0.14
    "nia"       -0.14
    "fires"     -0.14
    " Kou"      -0.14
    Positive Logits
    Token       Logit
    "PMENT"     0.23
    "ped"       0.22
    "ally"      0.18
    "票"        0.17
    "ement"     0.16
    "lis"       0.16
    "mental"    0.16
    "led"       0.15
    "ocoder"    0.15
    "dish"      0.15
    Activation Density: 0.075%

    No Known Activations