INDEX
    Explanations

    perfect describing something

    Negative Logits
     as        1.55
    ;          1.29
     и         1.19
    Ο          1.13
     but       1.12
     a         1.11
    ).         1.11
     ancak     1.11
     thiab     1.09
     טוב       1.08
    Positive Logits
    in         1.16
    д          0.97
    inę        0.92
    تها        0.89
    uña        0.88
    لاً        0.88
               0.87
    inį        0.87
    innt       0.86
    дят        0.84
    Act Density 0.020%

    No Known Activations