    Explanations

    expressions related to fear

    Negative Logits
    Token           Logit
    "arb"           -0.91
    "urgy"          -0.84
    "available"     -0.79
    "arbon"         -0.76
    "authors"       -0.75
    "properties"    -0.74
    "arkable"       -0.73
    "added"         -0.71
    "sample"        -0.70
    "options"       -0.70

    Positive Logits
    Token           Logit
    "lessly"         1.22
    " fear"          1.02
    "mong"           0.98
    "lessness"       0.96
    " fears"         0.96
    "ingly"          0.94
    " lest"          0.93
    "fully"          0.92
    " afraid"        0.88
    "fulness"        0.87
    Activation Density: 0.014%

    No Known Activations