A good way to figure out the logic and code for something like this is to bench-check it, i.e., pretend you are the processor and do it step by step by hand, using a piece of paper or even index cards as your registers and memory.
For instance, take a sheet of paper and put several columns on it. The first two should be headed "A" and "D" for the two registers. Then put columns for R0, R1, and R2, since you know that you need to use these per the problem statement. Then put a few more, either labeled with actual memory addresses or with variable names that you add as needed (knowing that variables get assigned to addresses starting at 16).
Then put a couple of numbers in R0 and R1 (since that is what the problem statement says will happen). Make them small, but not zero or one (you'll need to check these cases later though). Say 3 and 4.
Now think about how you are going to go about doing this in the simplest (not necessarily most efficient) way you can come up with, probably either adding 3 to itself 4 times or 4 to itself 3 times. Just pick one and go from there. Then proceed to do it, writing down the steps that you take. You can either do it in steps that map directly to individual ASM instructions, or you can do it in human-sized steps as long as you keep in mind that, eventually, you need to convert each of those steps into ASM instructions.
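If it helps to see what the finished sheet of paper might look like, here is a rough Python sketch that does the bench-check in human-sized steps and prints one row per step. It is only an illustration under some assumptions: the loop counter name i and the function names bench_check and print_sheet are made up for this example, the inputs are assumed to be non-negative, and the A and D columns are left out because each row stands for several ASM instructions rather than one.

```python
def bench_check(r0, r1):
    """Multiply r0 by r1 by adding r0 to a running total r1 times.

    Returns a list of (step description, snapshot of the columns) rows,
    like the rows you would write on the sheet of paper.
    """
    # Columns on the sheet: the three registers from the problem statement
    # plus one scratch variable, i (a hypothetical name for the loop counter).
    state = {"R0": r0, "R1": r1, "R2": 0, "i": r1}
    rows = [("start", dict(state))]
    while state["i"] > 0:
        state["R2"] += state["R0"]  # add R0 into the running total
        state["i"] -= 1             # one fewer addition left to do
        rows.append(("R2 += R0; i -= 1", dict(state)))
    return rows


def print_sheet(rows):
    """Print the rows as a fixed-width table, one column per register or variable."""
    cols = ["R0", "R1", "R2", "i"]
    print("step".ljust(20) + "".join(c.rjust(5) for c in cols))
    for step, state in rows:
        print(step.ljust(20) + "".join(str(state[c]).rjust(5) for c in cols))


print_sheet(bench_check(3, 4))
```

With 3 and 4 the last row ends with R2 at 12 and i at 0. Running it again with a zero or a one in either position is a quick way to check those edge cases before converting each step into ASM instructions.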