INDEX
Explanations
references to specific cities and their characteristics
preceding names of cities or capitals
major cities and capitals
NEGATIVE LOGITS
церковь  -0.45
церкви   -0.43
         -0.43
조       -0.42
文庫     -0.41
vo       -0.41
조       -0.41
,        -0.40
<eos>    -0.40
fatta    -0.40
POSITIVE LOGITS
city        1.98
cities      1.61
CITY        1.60
city        1.59
City        1.58
metropolis  1.49
capital     1.49
CITY        1.49
City        1.48
getCity     1.41
Activations Density 0.126%
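
The density figure above is not defined on this page. A minimal sketch of one common reading, assuming density means the fraction of token positions on which the feature's activation is nonzero. The function name `activation_density`, the `threshold` parameter, and the sample counts are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    feature fires at all; the page itself does not state the definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 100,000 token positions, 126 of them active.
sample = [0.0] * 99_874 + [1.0] * 126
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density near 0.1% would describe a sparse feature that fires on roughly one token in a thousand, which fits a feature keyed to words such as "city" and "capital".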