Explanations
word or phrase followed by punctuation
Negative Logits
quando    0.48
ﺭ    0.46
Р    0.46
color    0.45
function    0.44
gimana    0.44
kako    0.43
poate    0.43
}$,    0.43
trader    0.42
Positive Logits
ຂອງ    0.43
నీ    0.43
هیچ    0.42
ध्य    0.41
географи    0.41
щены    0.40
约束    0.40
莼    0.39
मूलन    0.38
至於    0.38
Activations Density 0.002%