INDEX
    Explanations

    expressions related to discussions or reflections on various topics

    expressions of hesitation, caution, or shame regarding personal experiences or opinions

    Negative Logits
    Token               Logit
    "ngth"              -0.80
    "ynthesis"          -0.79
    "strous"            -0.73
    "vantage"           -0.71
    "ictionary"         -0.66
    "DragonMagazine"    -0.66
    "anooga"            -0.65
    "odder"             -0.65
    "orld"              -0.64
    "uction"            -0.64
    Positive Logits
    Token               Logit
    " how"              1.21
    " what"             1.05
    " admitting"        1.04
    " acknowledging"    1.02
    " whether"          1.01
    " letting"          0.99
    " where"            0.96
    " choosing"         0.95
    " disclosing"       0.94
    " knowing"          0.94
    Act Density: 0.187%

    No Known Activations