INDEX
    Explanations
    Negative Logits
    " AND"         0.38
    " কিংবা"       0.34
    " અથવા"        0.33
    " either"      0.33
    "AND"          0.32
    "或者是"        0.32
    " அல்லது"      0.31
    " paraphrase"  0.31
    " или"         0.31
    " or"          0.30
    Positive Logits
    " جدا"           0.45
    " :)"            0.44
    " 😁"            0.44
    " 🙌"            0.42
    [token missing]  0.42
    " 😊"            0.41
    [token missing]  0.41
    "😁"             0.40
    "👌"             0.40
    " 👌"            0.39
    Activation Density: 0.028%

    No Known Activations