INDEX
    Explanations

    phrases related to complex systems and their functionalities

    Negative Logits
    Token             Logit
    " themselves"     -1.07
    "themselves"      -0.84
    " were"           -0.74
    " are"            -0.70
    " their"          -0.61
    " Their"          -0.58
    "Mnemonic"        -0.57
    "selves"          -0.57
    " توانند"         -0.57
    "PLICATE"         -0.55
    Positive Logits
    Token             Logit
    "itself"          0.87
    " []:"            0.83
    " itself"         0.74
    "DOES"            0.72
    "does"            0.71
    "حياتها"          0.68
    " ProtoMessage"   0.67
    "Дереккөздер"     0.65
    "Искәрмәләр"      0.64
    " rains"          0.64
    Activation Density: 0.371%
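
    The density figure above is commonly read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical, and the exact definition used to produce the 0.371% figure is an assumption, not something this page states.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 3 of the 8 token positions are active.
sample = [0.0, 0.0, 1.2, 0.0, 0.4, 0.0, 0.0, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 37.500%
```

    Under this reading, a density of 0.371% would mean the feature is active on roughly 4 of every 1,000 tokens sampled.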

    No Known Activations