Explanations
certainly, indeed, undoubtedly
NEGATIVE LOGITS
গোপন    0.35
مباشر    0.35
[token missing]    0.34
mixture    0.34
ոչ    0.34
решил    0.33
abas    0.33
সরাস    0.33
dépour    0.33
langsung    0.32
POSITIVE LOGITS
確かに    0.93
certainly    0.78
Certainly    0.75
确实    0.75
indeed    0.75
certes    0.75
certamente    0.73
undoubtedly    0.70
Certainly    0.68
確實    0.65
Activations Density 0.618%