Explanations
locations or directions
mentions of geographic locations, particularly those labeled as "east" or "west."
Negative Logits
Token       Logit
istg        -0.80
vous        -0.79
reluct      -0.78
Deal        -0.77
Number      -0.74
TABLE       -0.73
wcsstore    -0.72
abilia      -0.72
andom       -0.72
thood       -0.71
Positive Logits
Token        Logit
ward         1.17
side         1.15
wards        1.02
coast        0.93
hemisphere   0.83
side         0.80
wing         0.78
west         0.78
bound        0.78
west         0.77
Activations Density: 0.028%