    Explanations

    intentionality or tuning

    Negative Logits
    Token               Logit
    "intention"         -1.21
    " Intention"        -1.20
    " involve"          -1.08
    " intentional"      -1.02
    " Inten"            -0.94
    " signal"           -0.93
    " intention"        -0.93
    " intentionally"    -0.92
    " INVOL"            -0.91
    " intentions"       -0.90
    Positive Logits
    Token               Logit
    "ist"               0.77
    "ized"              0.74
    "ised"              0.70
    "ization"           0.68
    "ize"               0.67
    "ise"               0.66
    "izing"             0.63
    "ier"               0.53
    "ising"             0.53
    "iel"               0.51
    Activation Density: 0.164%

    No Known Activations