    Explanations

    terms related to skills and capabilities

    Negative Logits
    " itself"    -1.23
    " its"       -0.98
    " Itself"    -0.94
    " яке"       -0.90
    " Оно"       -0.87
    "itself"     -0.85
    "它的"       -0.84
    " Its"       -0.81
    "Its"        -0.79
    "its"        -0.76
    Positive Logits
    "themselves"     0.87
    " themselves"    0.87
    " herself"       0.86
    " которые"       0.77
    "Those"          0.71
    " lesquelles"    0.70
    "those"          0.69
    " those"         0.67
    "herself"        0.67
    " които"         0.65
    Activation Density: 0.022%

    No Known Activations