INDEX
Explanations
descriptions emphasizing strength and effectiveness
Negative Logits
Nap      -0.61
i        -0.60
Tween    -0.60
e        -0.59
Nap      -0.58
k        -0.57
bribe    -0.57
E        -0.55
T        -0.54
Ski      -0.54
Positive Logits
Powerful      1.92
Powerful      1.83
powerful      1.81
powerful      1.72
puissant      1.60
puissante     1.56
poderos       1.41
powerfully    1.41
poderosa      1.39
potente       1.39
Activations Density: 0.068%