INDEX
    Explanations
    Negative Logits
     Her         -0.45
    freien       -0.45
    }());        -0.45
    Her          -0.44
     hal         -0.43
    inary        -0.42
    Wh           -0.42
    WH           -0.42
    HER          -0.41
    makedirs     -0.40
    Positive Logits
    примеча       0.65
     الدولى       0.65
    GEBURTSDATUM  0.62
    UserScript    0.61
     فريبيس       0.59
    uxxxx         0.58
     ddelweddau   0.57
     تانيه        0.56
     isComment    0.56
    Jegyzetek     0.54
    Activation Density: 0.004%

    No Known Activations