INDEX
    Explanations

    phrases related to knowledge, beliefs, and actions taken by individuals or groups

    statements of knowledge or claims about various subjects

    Negative Logits
    Token       Logit
    "itaire"    -0.59
    "eur"       -0.55
    "earcher"   -0.55
    "aml"       -0.53
    " advoc"    -0.52
    "icter"     -0.52
    "asus"      -0.51
    "pex"       -0.51
    "ogl"       -0.50
    "Pass"      -0.50
    Positive Logits
    Token           Logit
    " themselves"   1.12
    "selves"        0.90
    " selves"       0.89
    " THEIR"        0.65
    " their"        0.64
    "MpServer"      0.61
    " helmets"      0.61
    " jointly"      0.60
    " li"           0.60
    " asses"        0.59
    Activation Density: 0.859%

    No Known Activations