INDEX
Explanations
simple tasks or concepts
references to simplicity or simplification in various contexts
Negative Logits
psey         -1.12
etheus       -0.72
outheast     -0.72
whale        -0.71
Normandy     -0.69
Whale        -0.68
reon         -0.66
everal       -0.66
ificantly    -0.65
whales       -0.64
Positive Logits
arithmetic   0.84
ItemImage    0.74
tons         0.73
simple       0.69
simple       0.67
execute      0.67
tru          0.65
PLAY         0.65
ASY          0.64
elegance     0.64
Activations Density 0.275%