INDEX
    Explanations

    "not ... but rather" constructions

    NEGATIVE LOGITS
    ITAS
    0.38
    0.38
    KURZ
    0.37
     wrześ
    0.36
                
    0.35
    												
    0.35
    Metaxy
    0.35
     szczeg
    0.34
     rohkem
    0.34
     ক্ষতি
    0.34
    POSITIVE LOGITS
     a
    0.51
     necessarily
    0.50
     straightforward
    0.50
    orious
    0.48
    eworthy
    0.46
     ringing
    0.45
     foolproof
    0.45
     as
    0.44
     commendable
    0.43
     easy
    0.43
    Activation Density 0.306%

    No Known Activations