INDEX
Explanations
shapes or forms
references to various shapes and their characteristics
Negative Logits
Token      Logit
onte       -0.73
Mub        -0.72
aires      -0.69
ammy       -0.67
govtrack   -0.66
iance      -0.64
cffffcc    -0.62
endment    -0.62
UES        -0.61
unts       -0.60
Positive Logits
Token      Logit
shif       1.02
forms      0.97
ly         0.94
shape      0.91
shape      0.91
forming    0.88
Shape      0.86
changing   0.83
liness     0.82
form       0.82
Activations Density: 0.067%