    Explanations

    code and math

    Negative Logits
    -trained    -0.07
    Jesus       -0.07
    strict      -0.07
     practiced  -0.07
    ])**        -0.07
    accepted    -0.06
     filho      -0.06
     clauses    -0.06
    ódigo       -0.06
    Faces       -0.06

    Positive Logits
    :@"%@",      0.07
                 0.07
                 0.06
     С           0.06
                 0.06
    報告           0.06
     خانم        0.06
    تمبر         0.06
                 0.06
     harek       0.06

    Act Density 0.088%

    No Known Activations