I've been using SDD (spec-driven development) since February for all my mid-size and larger projects. It works well for me in several ways:
- two levels of task decomposition (first the workflow is split into multiple steps, then each task is split into multiple subtasks) help keep the session context clean and focused, which improves adherence and reduces cost
- multiple levels of verifying the model's understanding: you verify the requirements step first, then the design step, and only then the code implementation. This helps ensure that even if a mis-implementation happens, it is tactical (at the code level) rather than strategic (at the design level)
- the agent works as a helper that analyzes what you need to build, interviews you, and builds a spec, which is better than just a short prompt
- it works nicely with coding agents orchestrated around a task queue
I started with GSD and later implemented my own workflow to better fit my typical feature size.
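To make the two levels of decomposition and the task queue concrete, here is a minimal sketch. It is not GSD or any real framework: the `Step` and `Task` types, the step names, and the subtasks are all hypothetical, and the "agent session" is just a string standing in for a fresh, small context per subtask.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    step: str          # workflow step this subtask belongs to (level 1)
    description: str   # what the agent is asked to do (level 2)


@dataclass
class Step:
    name: str
    subtasks: list[str] = field(default_factory=list)


def build_queue(steps: list[Step]) -> deque[Task]:
    """Flatten workflow steps into an ordered queue of subtasks."""
    return deque(Task(step.name, sub) for step in steps for sub in step.subtasks)


def run(queue: deque[Task]) -> list[str]:
    """Process one subtask at a time, each with its own small context."""
    log = []
    while queue:
        task = queue.popleft()
        # Hypothetical stand-in for starting a fresh agent session.
        context = f"[{task.step}] {task.description}"
        log.append(context)
    return log


# Hypothetical workflow: requirements and design are verified before code.
workflow = [
    Step("requirements", ["interview user", "write requirements doc"]),
    Step("design", ["draft design doc"]),
    Step("implementation", ["implement feature", "write tests"]),
]
log = run(build_queue(workflow))
```

Because every subtask carries only its own step name and description, no session has to hold the whole project in context, which is the point of the first bullet above.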
Might I suggest that you're making the whole process a little more difficult than it needs to be with all those specific tools and frameworks?
Example: I'm building a music player in Swift for macOS. I want to have visualizers that use Metal to render directly with the GPU. I don't understand GPU rendering. I never will. So, what do I do?
I tell a strong model like Opus or Fable to write a spec first. I also tell it that I want "lesser/cheaper models" to be able to do the actual implementation without needing to guess or think too hard about it. It builds something exactly like this: https://github.com/bocan/bocan-music/blob/main/docs/design-s...
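The split between models can be pictured as a small routing table. This is only a sketch of the idea: the phase names and the `strong-model` / `cheap-model` labels are placeholders I made up, not real model identifiers or any tool's API.

```python
# Hypothetical phase-to-model routing; the labels are placeholders.
ROUTES = {
    "spec": "strong-model",      # the spec is written once by the strong model
    "implement": "cheap-model",  # implementation follows the spec
    "fix": "cheap-model",        # bug fixes also go to the cheaper model
}


def pick_model(phase: str) -> str:
    """Return the model label for a phase, rejecting unknown phases."""
    try:
        return ROUTES[phase]
    except KeyError:
        raise ValueError(f"unknown phase: {phase!r}") from None
```

The more detail the spec carries, the less the cheaper model has to guess during the `implement` and `fix` phases.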
As you can see, I don't just tell it to build software. I tell it to build tests. My CLAUDE.md has guardrails. It knows that if it's told to build anything, it has to format, lint, and run both the new tests and the old tests. It knows to keep my coverage above 80%. When it's done, it commits with a highly detailed commit message.
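Those guardrails amount to a gate in front of the commit. A rough sketch of that gate as plain logic, assuming the results of each check have already been collected somehow: the `CheckResults` type and its field names are hypothetical, and reading "above 80%" as strictly greater than 80 is my interpretation.

```python
from dataclasses import dataclass

COVERAGE_FLOOR = 80.0  # "keep my coverage above 80%"


@dataclass
class CheckResults:
    formatted: bool
    lint_clean: bool
    new_tests_pass: bool
    old_tests_pass: bool
    coverage_percent: float


def failed_guardrails(r: CheckResults) -> list[str]:
    """List every guardrail that failed, in a fixed order."""
    failures = []
    if not r.formatted:
        failures.append("format")
    if not r.lint_clean:
        failures.append("lint")
    if not r.new_tests_pass:
        failures.append("new tests")
    if not r.old_tests_pass:
        failures.append("old tests")
    # Assumption: "above 80%" means strictly greater than the floor.
    if r.coverage_percent <= COVERAGE_FLOOR:
        failures.append("coverage")
    return failures


def may_commit(r: CheckResults) -> bool:
    """The agent commits only when no guardrail failed."""
    return not failed_guardrails(r)
```

Returning the list of failures rather than a bare boolean gives the agent something specific to fix before it tries to commit again.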
When all that's done, I test the software extensively by hand. Does it do exactly what I want? If so, great. Next feature. If not, I go back and have my lesser/cheaper model fix the code and add tests. I rarely, if ever, update the specs when bugs are found, as there's no need.
Cool idea, I hadn't considered having lesser models pick up the implementation laid out in the strong model's plan.
A question about the actual generated code: is its quality a priority for you beyond checking that it meets the design doc? For example, if the LLM goes ahead and builds a feature that passes the tests, but the code contains duplicate functions, abstractions, etc., would you steer the LLM to fix that, or even dive in yourself to rearchitect?
I use https://github.com/EveryInc/compound-engineering-plugin/blob... from https://every.to.
Here's a write-up on it: https://every.to/guides/compound-engineering
There's so much better out there. I use https://github.com/ChristopherAlphonse/adadex with the GSD skill.