INDEX
    Explanations

    phrases related to reflections and experiences, particularly in governance and societal contexts

    Negative Logits
    Token              Logit
    "heten"            -0.16
    " Importance"      -0.14
    " anymore"         -0.14
    "enco"             -0.14
    " respectively"    -0.14
    " everywhere"      -0.14
    "quia"             -0.14
    " Conrad"          -0.14
    "acher"            -0.14
    "sted"             -0.14

    Positive Logits
    Token              Logit
    "ones"             0.27
    "ONES"             0.20
    "oth"              0.16
    "basic"            0.16
    "hte"              0.16
    "íıī"              0.15
    "PACE"             0.14
    " sembl"           0.14
    "PELL"             0.14
    "annis"            0.14
    Act Density 0.180%

    No Known Activations