INDEX
Explanations
phrases related to stairs
references to staircases and elevators
Negative Logits
natureconservancy  -0.74
ucks               -0.73
vironment          -0.73
itarian            -0.72
zsche              -0.72
uala               -0.70
yright             -0.69
Cola               -0.68
essee              -0.68
Samoa              -0.68
Positive Logits
ways               1.13
stairs             1.06
stair              0.99
way                0.99
staircase          0.95
WAY                0.93
shaft              0.87
stairs             0.85
slope              0.80
hallway            0.80
Activations Density 0.053%