INDEX
Explanations
foreign languages and specific nouns
Negative Logits
수도  0.41
苏州  0.40
वायरिंग  0.39
asmussen  0.39
yaa  0.39
ității  0.38
atthena  0.38
졌  0.38
planung  0.38
गृह  0.37
Positive Logits
predomin  0.46
nons  0.43
predominantly  0.42
undesirable  0.42
uncon  0.41
فيها  0.41
unpredictable  0.40
detractors  0.40
أي  0.39
স্বপ্ন  0.39
Activations Density 0.000%