INDEX
Explanations
postcolonial, than, Ghost, or
Negative Logits
𝐌  0.95
muebles  0.93
ی  0.89
𝐖  0.88
afectada  0.83
𝙈  0.82
蹤  0.82
𝐼  0.82
𝐓  0.81
𝐎  0.80
Positive Logits
disamb  0.70
Un  0.66
Prak  0.65
flint  0.65
тэ  0.64
ды  0.63
,  0.63
reps  0.63
বুদ্ধ  0.63
Eur  0.62
Activations Density 0.005%