INDEX
    Explanations
    Negative Logits
     Tasman    -0.93
     navigu    -0.48
     slutt    -0.46
    ompi    -0.42
    coef    -0.40
    rootDir    -0.40
     davran    -0.40
    dymyr    -0.39
     Atlántico    -0.39
    tometer    -0.39
    Positive Logits
     ModelExpression    0.77
     الحره    0.69
    DebuggerNonUser    0.68
     متعلقه    0.67
    تقاوى    0.66
     Biôgrafia    0.64
    بوابة    0.62
    :+:    0.60
     الرياضيه    0.60
    AddTagHelper    0.59
    Act Density 0.010%

    No Known Activations