INDEX
    Explanations

    sentences containing explanations or descriptions

    instances of explaining situations or concepts

    Negative Logits
    "thood"        -0.73
    "iaries"       -0.72
    "luster"       -0.70
    "mage"         -0.70
    "oso"          -0.68
    "ngth"         -0.67
    "essee"        -0.65
    "tackle"       -0.64
    "elight"       -0.62
    "isha"         -0.62

    Positive Logits
    " rationale"    1.14
    " reasoning"    1.05
    " why"          0.92
    "why"           0.91
    " concepts"     0.90
    " virtues"      0.87
    " principles"   0.87
    " criteria"     0.85
    " workings"     0.85
    " WHY"          0.85
    Act Density 0.267%

    No Known Activations