    Explanations

    exaggerated

    Negative Logits
    uting            -0.08
     equilibrium     -0.08
    anyana           -0.07
    initi            -0.07
     گزار            -0.07
     ESA             -0.07
    Paragraph        -0.07
     mitt            -0.07
     అప              -0.07
    Aplic            -0.07
    Positive Logits
     exaggerated     0.19
     exagger         0.16
     exager          0.15
     caric           0.12
     outrageous      0.11
     unrealistic     0.11
     extravagant     0.11
     inflated        0.10
     generous        0.10
     grotes          0.10
    Act Density 0.016%

    No Known Activations