INDEX
Explanations
terms related to directions and positions, particularly vertical concepts
terms related to vertical and horizontal orientations or arrangements
Negative Logits
REDACTED    -0.86
mberg       -0.76
utic        -0.73
ghazi       -0.72
bats        -0.72
hell        -0.70
            -0.70
atories     -0.70
の魔        -0.70
ggies       -0.69
Positive Logits
axis            1.05
stripes         1.00
stabil          0.97
scrolling       0.93
orientation     0.90
displacement    0.89
takeoff         0.88
ity             0.87
layout          0.85
rect            0.84
Activations Density 0.040%