INDEX
Explanations
saying truths and explaining needs
Negative Logits
கார்ப  0.45
внед  0.45
АР  0.42
fornis  0.42
广泛  0.42
collaboratively  0.42
innovations  0.41
riduzione  0.41
модерни  0.41
చేపట్ట  0.40
Positive Logits
害怕  0.46
je  0.45
vraiment  0.45
mấy  0.45
怕  0.45
覺得  0.44
sakit  0.44
觉得  0.44
知道  0.43
nghĩ  0.43
Activations Density 0.008%