Explanations
transparency and customization
Negative Logits
gift          0.43
entspricht    0.42
decline       0.42
persuasion    0.41
legislative   0.41
批准          0.39
ellingen      0.38
ichtig        0.38
illings       0.38
റിന്റെ        0.38
Positive Logits
Polymers      0.52
гно           0.45
सम्प          0.44
Rodr          0.44
подру         0.43
смог          0.43
raí           0.43
さ            0.42
penyel        0.42
Име           0.42
Activations Density: 0.007%