Explanations
mentions of communities and places
Negative Logits
Token        Logit
åĪĹ          -0.16
lop          -0.15
ัย           -0.15
Ukra         -0.15
одав         -0.15
Branch       -0.14
Tent         -0.14
uninsured    -0.14
_mk          -0.14
">ÃĹ</       -0.13
Positive Logits
Token        Logit
hall         0.18
wealth       0.18
istrat       0.17
antics       0.16
perature     0.15
semble       0.15
inde         0.15
rophe        0.15
chaft        0.15
imitive      0.15
Activation Density: 0.005%
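
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the page does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only so that the result matches the 0.005% shown here.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature fires;
    this definition is not confirmed by the page itself.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 5 of 100,000 token positions.
acts = [0.0] * 100_000
for i in (10, 2_000, 31_415, 50_000, 99_999):
    acts[i] = 1.3  # arbitrary positive activation value

density = activation_density(acts)
print(f"{density:.3%}")  # formats the fraction as a percentage
```

Under this reading, a density of 0.005% corresponds to roughly 5 active tokens per 100,000, which would make this a very sparse feature.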