INDEX
    Explanations

    contrasting descriptions of people or events

    NEGATIVE LOGITS
    ï¸                -0.15
    able              -0.14
    kinson            -0.14
    ible              -0.14
    versible          -0.14
    elah              -0.14
    oken              -0.14
    _FIXED            -0.14
    ville             -0.13
     Harm             -0.13
    POSITIVE LOGITS
     yet              0.21
     occasionally     0.21
    yet               0.17
    .ActionListener   0.17
     sometimes        0.17
     часом            0.15
     prof             0.15
    ometimes          0.15
     moving           0.15
    beautiful         0.15
    Act Density 0.114%

    No Known Activations