INDEX
    Negative Logits
    Token           Logit
    " تت"           -0.08
    "it"            -0.07
    " uno"          -0.07
    " hull"         -0.07
    " Ut"           -0.07
    " Rust"         -0.07
    " د"            -0.06
    "130"           -0.06
    "IT"            -0.06
    " уст"          -0.06
    Positive Logits
    Token           Logit
    " comparison"   0.16
    " compare"      0.14
    " compared"     0.14
    " Compare"      0.13
    " comparing"    0.12
    " compares"     0.12
    " comparisons"  0.12
    "Compar"        0.12
    "compare"       0.12
    "Comparison"    0.12
    Activation Density: 0.034%

    No Known Activations