INDEX
    Explanations
    Negative Logits
    0.46
    0.45
    の詳細  0.45
     softening  0.44
    ゆっくり  0.44
     softness  0.43
     devaluation  0.43
    0.42
     گذشته  0.42
    詳細  0.41
    Positive Logits
     summary  0.62
     succinct  0.61
     concise  0.59
     brevity  0.57
     gọn  0.56
     concisely  0.55
    summary  0.55
     مختصر  0.52
     succinctly  0.52
     pith  0.52
    Activation Density: 0.013%

    No Known Activations