INDEX
    Explanations

    positive adjectives describing qualities and characteristics

    Negative Logits
    "sdale"        -0.15
    "recision"     -0.14
    "roker"        -0.14
    ".vn"          -0.14
    "forman"       -0.13
    "æŀģ"          -0.13
    "ismu"         -0.13
    "multiline"    -0.13
    "isex"         -0.13
    "omination"    -0.13
    Positive Logits
    " amount"           0.30
    " sense"            0.26
    " understanding"    0.25
    " grasp"            0.25
    " following"        0.25
    "amount"            0.25
    " degree"           0.24
    " handle"           0.24
    " level"            0.23
    " Amount"           0.22
    Activation Density: 0.162%

    No Known Activations