INDEX
    Explanations

    words associated with adjectives and their various forms, particularly those indicating qualities, conditions, and characteristics

    Negative Logits
    Token           Logit
    ìĿĦ             -0.21
    lo              -0.20
    ses             -0.20
    Ìĥ              -0.19
    ers             -0.19
    soever          -0.19
    ings            -0.19
    ma              -0.19
    ra              -0.18
    (whitespace)    -0.18
    Positive Logits
    Token           Logit
    -minded         0.19
    y               0.17
    ALLY            0.17
    amente          0.17
    ity             0.16
    /select         0.16
    yas             0.16
    -looking        0.16
    ourt            0.15
    elyn            0.15
    Activation Density: 0.143%

    No Known Activations