INDEX
    Explanations

    text related to explanations, justifications, or the underlying logic behind actions or decisions

    phrases related to reasoning and justification

    Negative Logits
    Token              Logit
    "adal"             -0.72
    "vette"            -0.68
    "onies"            -0.67
    "semble"           -0.67
    "uckle"            -0.66
    " stocking"        -0.65
    "hold"             -0.65
    " national"        -0.64
    "borg"             -0.64
    "vas"              -0.63

    Positive Logits
    Token              Logit
    " reasoning"       1.19
    " rationale"       1.01
    "DragonMagazine"   0.96
    " why"             0.95
    "SourceFile"       0.95
    " argument"        0.88
    " justification"   0.84
    " arguments"       0.80
    " excuse"          0.79
    " WHY"             0.79
    Activation Density 0.007%

    No Known Activations