INDEX
    Explanations

    phrases emphasizing significant quantities or strengths

    Negative Logits
    Token             Logit
    "ysc"             -0.74
    "uay"             -0.72
    "rick"            -0.70
    " Origins"        -0.70
    "flies"           -0.70
    " Annotations"    -0.68
    "runs"            -0.65
    "//"              -0.64
    "np"              -0.64
    "<"               -0.64
    Positive Logits
    Token             Logit
    " thing"          1.23
    " scenario"       1.04
    " situation"      0.95
    " feat"           0.90
    " delicate"       0.89
    " drastic"        0.87
    " huge"           0.86
    " sensitive"      0.86
    " hypothetical"   0.85
    " possibility"    0.82
    Activation Density: 0.026%

    No Known Activations