    Negative Logits
     BoxDecoration     -0.84
    IntoConstraints    -0.79
     surla             -0.76
    AndEndTag          -0.75
     للمعارف           -0.71
    DeleteBehavior     -0.71
    TemporalType       -0.71
     سكانية            -0.70
    URLException       -0.69
    ="@+               -0.68
    Positive Logits
    <bos>              0.49
     conseils          0.36
    Consejos           0.35
                       0.31
     abierto           0.31
     charité           0.30
     consejos          0.30
                       0.30
     mauvaises         0.29
     tulee             0.28
    Act Density 0.014%

    No Known Activations