Explanations
limiting or disempowering factors
Negative Logits
окружа  0.43
surrounding  0.40
вокруг  0.40
بات  0.37
astics  0.37
side  0.36
inement  0.36
ارين  0.36
neighborhood  0.35
आसपास  0.35
Positive Logits
ਂ  0.44
Gä  0.41
autorité  0.40
ädig  0.40
lacks  0.39
suffrage  0.39
⅙  0.39
ುತ್ತದೆ  0.38
lacking  0.37
gä  0.37
Activations Density 0.000%