INDEX
Explanations
words related to shape or transformation
references to various shapes and forms
Negative Logits
Mub         -0.79
govtrack    -0.76
onte        -0.74
iance       -0.71
Edited      -0.70
amily       -0.67
uers        -0.65
endment     -0.65
ondo        -0.64
GS          -0.64
Positive Logits
shif        0.98
shape       0.93
shape       0.87
shapes      0.78
ly          0.76
fitting     0.75
cut         0.74
Shape       0.72
sheet       0.72
lier        0.72
Activations Density 0.027%