    Explanations

    "more than" and comparisons

    Negative Logits
    (no token shown)   0.38
    "很有"             0.34
    (no token shown)   0.33
    (no token shown)   0.32
    " 매우"            0.32
    " очень"           0.31
    "是很"             0.31
    "any"              0.31
    " بكل"             0.31
    (no token shown)   0.31
    Positive Logits
    " than"     0.83
    " niż"      0.78
    " kuin"     0.63
    " чем"      0.62
    "Than"      0.60
    " decât"    0.60
    "than"      0.59
    " allá"     0.58
    " než"      0.57
    " Than"     0.52
    Act Density 0.052%

    No Known Activations