INDEX
Explanations
greetings and introductions
Negative Logits
Control        0.79
Elim           0.74
Organic        0.73
맨             0.70
Http           0.69
Organ          0.69
coordinated    0.69
封             0.69
Delete         0.68
mature         0.67
Positive Logits
name           0.80
nome           0.74
做什么         0.72
genus          0.72
permisos       0.71
chiamato       0.71
appelée        0.71
um             0.71
Nice           0.70
puis           0.70
Activations Density 0.279%