INDEX
    Explanations

    phrases related to self-reflection and introspection

    Negative Logits
    onal          -0.72
    heny          -0.72
    rought        -0.67
    olid          -0.65
     Federation   -0.62
    emis          -0.61
    cru           -0.61
     Mub          -0.59
     roundup      -0.59
    microsoft     -0.58
    Positive Logits
    é¾įåĸļ士        1.05
    selves          0.87
    ternally        0.77
    çīĪ             0.72
    imei            0.70
    Redd            0.70
    ortium          0.69
    ä»              0.69
    DragonMagazine  0.68
    ãģ¯             0.68
    Activation Density: 0.961%

    No Known Activations