INDEX
Explanations
mentions of physical barriers or boundaries, specifically fences
references to fences
Negative Logits
aming       -0.79
Instant     -0.78
uddin       -0.67
zin         -0.63
uz          -0.62
Kathleen    -0.62
izz         -0.61
ames        -0.61
Q           -0.60
Eng         -0.60
Positive Logits
fence        3.92
fences       2.93
fencing      2.10
wall         1.34
barrier      1.32
barric       1.30
railing      1.26
cage         1.23
perimeter    1.19
gate         1.15
Activations Density 0.016%