    Explanations

    code imports and headers

    Negative Logits
    Token        Logit
    " our"       -1.38
    " the"       -1.27
    (blank)      -1.27
    "πουργ"      -1.23
    " wielu"     -1.22
    " harán"     -1.22
    "'"          -1.20
    " ayudan"    -1.20
    "bearded"    -1.17
    (blank)      -1.16

    Positive Logits
    Token        Logit
    (blank)      1.83
    (blank)      1.46
    "fbox"       1.44
    (blank)      1.42
    " about"     1.41
    (blank)      1.36
    "μα"         1.34
    "følge"      1.34
    " posso"     1.32
    "一闪"       1.32

    Activation Density: 0.040%

    No Known Activations