INDEX
Explanations
mentions of physical walls or barriers
references to protective barriers or restrictions, particularly in digital contexts
Negative Logits
amate      -0.67
cess       -0.66
heny       -0.65
veh        -0.65
costly     -0.64
Gene       -0.59
igure      -0.58
emi        -0.58
expensive  -0.58
sb         -0.58
Positive Logits
abies      1.17
papers     1.06
paper      0.99
wall       0.91
wallpaper  0.84
aby        0.80
stones     0.79
igans      0.77
nesday     0.74
iard       0.68
Activations Density 0.016%