6 Comments
User's avatar
Lex Spoon's avatar

I have been using Cursor a lot. I tried Copilot since it was one of the firsts, and Windsurf because of its good free tier, and now I use Cursor with the lowest level of subscription plan.

The agent mode of these tools feel to me a lot like having an intern. The agent has gobs of energy and is very happy, as you say, to search around the web for ways to do things. Getting a good result, though, involved giving some structure to it to harness all that energy.

Giving that structure, in turn, is a skill set all of its own. Like any skill, it will improve with practice.

Jo Christian Oterhals's avatar

I saw this on Facebook, too. I'm not sure what you're trying to achieve. Could you give an example of input and expected output? I'd like to use a couple of the language models I suggested to you there, and see if they fare better than Copilot.

Gary Luckenbaugh's avatar

I cheated and peaked at the algorithm. It's based on computing "Ershov numbers" which in compiler design, are labels assigned to nodes in an expression tree to determine the minimum number of registers needed to evaluate the expression, optimizing code generation. For single expressions it's not NP, and register allocation CAN be optimized. My triples are isomorphic to a tree, and in fact that's the representation they used in the book. The t's are interior nodes in the tree. I didn't look at the details of the algorithm. I want to see if I can generate the algorithm knowing this hint. There is no way CoPilot could have generated this unless it had been trained on algorithms.

Gary Luckenbaugh's avatar

Here's an example of code that the Unix V6 C compiler produces. This is an ancient, pre-Kernighan and Ritchie C:

% cat t.c

int a,b,c,d,e;

main() {

e = a + b * (c + e) / (a + e);

}

_main:

~~main:

jsr r5,csv

mov _c,r1 ; r1 <- c

add _e,r1 ; r1 <- r1 + e

mul _b,r1

sxt r0 ; sign extension

mov _a,r2

add _e,r2

div r2,r0

add _a,r0

mov r0,_e

L1:jmp cret

Gary Luckenbaugh's avatar

I have a program that converts infix expressions to statements of the form: temp = a op b. I generate new temps for every subexpression like:

a+b*(c-d)-e

Becomes something like:

t1=c-d

t2=b*t1

t3=a+t2

result=t3-e

I think of the t values being similar to registers, but with an infinite supply of them. I'd like to assign them to registers, and only spill them to RAM when necessary. One of the commenters on Facebook pointed out that optimizing register use is an NP problem. There is a heuristic algorithm in the Aho, et. al., compiler text book, but I'm trying to recreate it from memory without looking in the book. I'm not sure why I tried copilot, I guess just to see what it could do, and at first I seemed to be making some progress with it, but then it really went off the rails.

Having done that, I think it's led me to a possible solution, but I don't think the result is as good as the Aho algorithm.

I have the Unix V6 C compiler, and it is insanely good at register allocation. I remember playing with it, and it was next to impossible to get it to spill registers. After decades, I want to go back and experiment with it to see if I can create some pathological cases that would get it to spill registers. The PDP-11 only had 3 general purpose registers out of 8 total, so it should not be that hard. From that I want to reverse engineer the algorithm.

I believe Ritchie used the algorithm documented in the Aho book, it came from that Princeton and Bell Labs crowd.