    Explanations

    code and abstractions

    Negative Logits
    " working"          -0.89
    " WORKING"          -0.78
    "working"           -0.78
    "Working"           -0.76
    " Working"          -0.73
    " flying"           -0.71
    "Personendaten"     -0.70
    "ConstraintMaker"   -0.66
    "WORKING"           -0.66
    " المعيارى"         -0.66
    Positive Logits
    " work"             0.89
    " Work"             0.65
    "Work"              0.59
    "work"              0.59
    " WORK"             0.56
    " darb"             0.56
    "clusion"           0.44
    "ugat"              0.43
    " trabalho"         0.43
    " Nadal"            0.43
    Activation Density: 0.000%

    No Known Activations