INDEX
    Explanations

    words used in comparisons with "than"

    Negative Logits
    Token              Logit
    " соответствии"    0.85
    " يتع"             0.79
    "):"               0.77
    " sadece"          0.75
                       0.75
    " uniquement"      0.74
    " yalnızca"        0.74
    "ogliamo"          0.73
    "最初に"            0.72
    " Grâce"           0.72
    Positive Logits
    Token          Logit
    " than"        3.80
    " Than"        2.77
    "than"         2.75
    " niż"         2.48
    "Than"         2.48
    " än"          2.32
    " kuin"        2.31
    " decât"       2.04
    " compared"    2.01
    " než"         1.95
    Activation Density: 0.319%

    No Known Activations