INDEX
    Explanations

    distinctive adjectives and comparative phrases

    Negative Logits
    Token (quoted)   Logit
    "elin"           -0.15
    "\tfflush"       -0.14
    ".ax"            -0.14
    " Shut"          -0.14
    "li"             -0.14
    "ven"            -0.14
    "lic"            -0.14
    "DonaldTrump"    -0.13
    " bank"          -0.13
    " frank"         -0.13

    Positive Logits
    Token (quoted)   Logit
    "ones"           0.17
    "è¾ĥ"            0.16
    "ãĥ³ãĥĶ"         0.16
    "ICLE"           0.16
    " než"           0.16
    "-than"          0.15
    " Ones"          0.15
    " ones"          0.15
    " portions"      0.15
    "ksam"           0.14
    Activation Density 0.109%

    No Known Activations
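
Two of the positive-logit tokens above (è¾ĥ and ãĥ³ãĥĶ) look like raw byte-level BPE token strings rather than readable text. The sketch below shows one way to map such strings back to UTF-8, assuming the model's tokenizer uses the GPT-2-style byte-to-unicode table; that assumption is not confirmed by this page, and the helper names `bytes_to_unicode` and `decode_token` are illustrative only.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2-style byte-level BPE.

    ASSUMPTION: the tokenizer behind this dashboard follows the same convention.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(0x21, 0x7F))     # '!' .. '~'
        + list(range(0xA1, 0xAD))   # U+00A1 .. U+00AC
        + list(range(0xAE, 0x100))  # U+00AE .. U+00FF
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level token string back into text (hypothetical helper)."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a literal space) are assumed
            # to be already decoded and are passed through unchanged.
            raw.extend(ch.encode("utf-8"))
    # Partial multi-byte sequences become U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["\u00e8\u00be\u0125", "\u00e3\u0125\u00b3\u00e3\u0125\u0136", "-than"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption, the first token decodes to the single code point U+8F83 and the second to the two katakana code points U+30F3 U+30D4; plain ASCII tokens such as "-than" come back unchanged. Tokens the dashboard already shows with literal spaces or accented letters (for example " než") should be left as they are, since the pass-through branch cannot tell a decoded "è" from a byte-level one.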