    Explanations

    occurrences of the word "helpful" in the text

    expressions related to helpfulness and assistance

    Negative Logits
    "metal"      -0.75
    "gow"        -0.75
    "buck"       -0.75
    "thur"       -0.74
    "Vision"     -0.70
    "Rush"       -0.69
    " Bam"       -0.69
    "pool"       -0.68
    "inction"    -0.67
    "scar"       -0.67
    Positive Logits
    " helpful"        0.98
    " undermin"       0.87
    " conduc"         0.78
    " guiActiveUn"    0.77
    " useful"         0.76
    " aide"           0.75
    "ãĤĭ"             0.75
    " behavi"         0.75
    " Helpful"        0.74
    "ãĥ¼ãĥĨ"          0.73
    Activation Density: 0.007%

    No Known Activations