INDEX
    Explanations

    references to measuring or quantifying items or actions

    Negative Logits
    de          -0.47
                -0.47
    ra          -0.46
    no          -0.46
    to          -0.46
     lo         -0.43
    na          -0.43
     or         -0.42
     (          -0.42
     .          -0.41
    Positive Logits
     للمعارف     1.20
     vez        1.01
     eens       1.00
    __":        0.95
     gången     0.94
     time       0.94
    "):         0.94
     للاسماء     0.92
     gangen     0.90
     fois       0.88
    Activation Density: 0.093%

    No Known Activations