INDEX
    Explanations

    descriptions emphasizing strength and effectiveness

    Negative Logits
    Token           Logit
    " Nap"          -0.61
    "i"             -0.60
    "Tween"         -0.60
    "e"             -0.59
    "Nap"           -0.58
    "k"             -0.57
    " bribe"        -0.57
    "E"             -0.55
    "T"             -0.54
    " Ski"          -0.54
    Positive Logits
    Token           Logit
    "Powerful"      1.92
    " Powerful"     1.83
    " powerful"     1.81
    "powerful"      1.72
    " puissant"     1.60
    " puissante"    1.56
    " poderos"      1.41
    " powerfully"   1.41
    " poderosa"     1.39
    " potente"      1.39
    Activation Density: 0.068%

    No Known Activations