Explanations
to help ask for more information
Negative Logits
avoid     1.16
choose    1.00
避免      0.92
avoid     0.90
Avoid     0.90
jot       0.89
evitar    0.86
Avoid     0.85
stay      0.84
говоря    0.83
Positive Logits
ніка      0.89
자들이    0.86
เคราะห์    0.85
pptd      0.82
manpower  0.82
लिट       0.80
Helps     0.80
موجود     0.79
جزء       0.79
helps     0.79
Activations Density 0.011%