Explanations
answering questions or following instructions
Negative Logits
Token          Value
poems          0.76
Alpine         0.74
cookbooks      0.74
</h3>          0.73
cytotoxic      0.73
<unused972>    0.71
<unused723>    0.70
songwriting    0.69
бк             0.69
playwright     0.69
Positive Logits
Token          Value
ación          1.04
ního           1.03
ovog           1.00
meu            0.98
ificación      0.94
ový            0.92
äk             0.91
τον            0.91
Meu            0.90
bakalım        0.90
Activations Density: 0.117%