Explanations
words related to physical barriers or boundaries, particularly fences
references to fences
Negative Logits
Token       Logit
olitan      -0.75
Monetary    -0.73
CES         -0.72
":"/        -0.67
TAMADRA     -0.66
iations     -0.64
iation      -0.62
ACTED       -0.62
ortion      -0.61
isky        -0.60
Positive Logits
Token       Logit
fence       1.16
fences      1.01
erected     0.95
fencing     0.94
separating  0.88
guarding    0.87
barric      0.84
railing     0.83
keepers     0.82
encl        0.81
Activations Density: 0.069%