    Explanations

    spiciness or colorful descriptions

    Negative Logits
    "hel"           0.48
    " والاح"        0.47
    "ليف"           0.47
    "ئات"           0.46
    "ingles"        0.45
    " Milliarden"   0.43
    "founders"      0.42
    "āk"            0.42
    "engen"         0.42
    "امل"           0.42
    Positive Logits
    (token missing)   0.55
    (token missing)   0.47
    " ð"              0.47
    " अच्छा"           0.46
    " greatly"        0.44
    " землю"          0.43
    (token missing)   0.43
    " preparação"     0.43
    " subcontinent"   0.43
    (token missing)   0.42
    Activation Density: 0.002%

    No Known Activations