Is Vibe Coding Really A Thing?

I decided to look into this question yesterday after watching a video of Eric Schmidt, the former CEO of Google, discussing the looming upheaval of the job market due to advancements in “AI”. Opening an instance of Gemini, I got to work testing how useful this tool might be for code generation and found… vibe coding really is (kind of) becoming a thing.

Now, I think it would be prudent to disclose that I was using the free version of Gemini, so it’s possible the more advanced version is significantly better. I did use Claude a bit, too, but ran out of free requests pretty quickly and wasn’t able to give it a fair shake, so this article will primarily chronicle my experience with Gemini. I also tried using Copilot on Windows 11, but it seemed to be quite a bit behind what the others were capable of and will be left out of this editorial. I should also mention that my approach was not very scientific and deals much more with the resulting output than with the code's innards, which I didn't dig into too deeply. This is vibe coding we’re talking about, after all. Suffice it to say, you may get different results with different phrasing of your prompts or by progressing through your requests differently. Heck, it might do different things on different days with the exact same inputs for all I know. This shouldn’t really matter too much, though: for this to be a genuinely useful tool, such chaotic tendencies need to be minimal enough that it can reliably understand what someone is getting at without excessive specificity or context-dependent responses.

I started my evaluation with some requests in an area in which I have become somewhat knowledgeable: WebGL. Since I don’t have very much experience implementing raw WebGL myself aside from writing shaders, I had Gemini use the three.js library, which would give me both a better vocabulary to draw on and specific knowledge of the tool set that could be pertinent to troubleshooting. My first request was most likely a bit too complex for it to handle properly. I asked it to generate a “lava lamp-like” animation and got a blank canvas back as a result. After trying to phrase this request in several different ways and getting a non-rendering response back each time, I figured the concept of a lava lamp must be either too vague or too specific (or both!) for the AI to implement in any way close to what I wanted.

Rather than keep pushing on something difficult, I backpedaled to something every three.js venturer encounters on their journey: the spinning cube. It did this dutifully, but I wasn’t exactly happy that it was rendered with flat colors and no lighting, nor was I particularly impressed, since there are many full-code examples of the spinning cube on the internet that can be copy/pasted verbatim and run properly. To test whether it could go beyond simple copy/pasting, I decided to ask it to change the scene in specific ways, and, sure enough, it ‘understood’ these basic requests and reliably updated the scene by changing the relevant parameters in the code. Whether changing the material type or the color, adding an emissive map, or even putting a stylized T. rex made from primitive shapes on each face, it did about as well as I could have expected, though I definitely had to stretch my imagination a bit to see the T. rex.
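For reference, a minimal spinning-cube scene of the sort I ended up with looks something like the sketch below. This is my own illustration rather than Gemini’s output, and the colors and parameters are just placeholders, but it shows why the follow-up requests were easy to satisfy: “change the color” or “use a different material” amounts to editing a line or two.

```javascript
import * as THREE from 'three';

// Scene, camera, and renderer boilerplate
const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(75, window.innerWidth / window.innerHeight, 0.1, 100);
camera.position.z = 3;
const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

// A lit, shaded material rather than the flat, unlit look the first attempt had
const cube = new THREE.Mesh(
  new THREE.BoxGeometry(1, 1, 1),
  new THREE.MeshStandardMaterial({ color: 0x44aa88 })
);
scene.add(cube);

// Lights, so the standard material actually shades
scene.add(new THREE.AmbientLight(0xffffff, 0.3));
const sun = new THREE.DirectionalLight(0xffffff, 1.0);
sun.position.set(2, 3, 4);
scene.add(sun);

// Spin the cube a little each frame
function animate() {
  requestAnimationFrame(animate);
  cube.rotation.x += 0.01;
  cube.rotation.y += 0.01;
  renderer.render(scene, camera);
}
animate();
```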
For simple changes to a relatively simple project, it seemed as though it could definitely be quicker to have an AI find the place in the code to make a change and then implement it, rather than poring through multiple files and hundreds if not thousands of lines of code to make the change yourself. Still, CTRL+F is a powerful tool in its own right, and trusting an AI to save you maybe a minute or two, while perhaps mucking something up in the process, may not be justifiable in many cases unless you’re just prototyping something, need immediate visual feedback, and don’t care about leaving a trail of slop in your wake.
Rotating cube generated by Gemini using three.js
Now that I felt comfortable it could do very basic things pretty well, I decided to give it a more challenging test. This time, I chose a project a lot of people undertake as they are learning graphics programming, one with ample examples out there but more complex implementation details: a field of grass swaying in the wind. Again, I started with a prompt that was probably a bit too complicated for Gemini, asking it to generate the grass using an advanced rendering technique called “shell texturing”, but when I backed off from that and had it generate the grass using “normal meshes”, it did more or less what I wanted… within reason. The results looked terrible, of course, but what it spat out was a good enough starting point to whittle into something better through a series of specific changes. After changing how it constructed the blades of grass, how the wind deflected each stalk, how the wind swept over the field, how many blades of grass there were, and how the clouds were rendered, I got a result that was, honestly, not bad for a junior dev doing this for the first time, provided you ignored how poorly optimized and ‘quirky’ it was (a rough sketch of this one-mesh-per-blade setup is included below).
Rolling hills with grass swaying in the wind generated by Gemini with three.js
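This is roughly the shape of that naive, one-mesh-per-blade approach, sketched by me rather than taken from Gemini’s output; the blade count, colors, and the sine-wave “wind” are all placeholders:

```javascript
import * as THREE from 'three';

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(60, window.innerWidth / window.innerHeight, 0.1, 200);
camera.position.set(0, 4, 12);
camera.lookAt(0, 0, 0);
const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);
scene.add(new THREE.HemisphereLight(0xbfd8ff, 0x3a5f0b, 1.0));

// One mesh per blade: easy to reason about, terrible for performance at scale
const bladeGeometry = new THREE.PlaneGeometry(0.05, 1, 1, 4);
bladeGeometry.translate(0, 0.5, 0); // pivot at the base so each blade bends from the ground
const bladeMaterial = new THREE.MeshLambertMaterial({ color: 0x4caf50, side: THREE.DoubleSide });

const blades = [];
for (let i = 0; i < 2000; i++) {
  const blade = new THREE.Mesh(bladeGeometry, bladeMaterial);
  blade.position.set((Math.random() - 0.5) * 20, 0, (Math.random() - 0.5) * 20);
  blade.rotation.y = Math.random() * Math.PI;
  scene.add(blade);
  blades.push(blade);
}

// Sweep a sine-wave "gust" across the field by phase-shifting each blade by its x position
const clock = new THREE.Clock();
function animate() {
  requestAnimationFrame(animate);
  const t = clock.getElapsedTime();
  for (const blade of blades) {
    blade.rotation.z = 0.25 * Math.sin(2.0 * t + 0.5 * blade.position.x);
  }
  renderer.render(scene, camera);
}
animate();
```

Every blade here is its own object in the scene graph, which is exactly what makes this version easy for an AI to tweak on request and increasingly expensive to render as the blade count climbs.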
The struggle began, though, when I wanted to optimize this scene and remove some of its quirks. I tried to have it refactor the code to use instanced rendering for the grass, a common optimization technique when you have a lot of identical or very similar objects in a scene. Having done such a conversion from individual meshes to instanced rendering before, I knew it involved a paradigm shift in how each object is represented in the code, and I suspected it might have trouble integrating all the changes needed to accommodate that shift into the original scene (a sketch of the instanced approach I was after is included further below). After trying several different prompts to implement this change, the best I got back was a landscape with no visible grass. Many prompts simply resulted in errors that prevented the scene from rendering altogether. Interestingly, though, when I tried to implement the switch to instanced rendering with Claude, Anthropic’s AI, it was able to make the transition successfully, though it added some quirks of its own in the process that I unfortunately wasn’t able to work out before my free requests ran out. I still can’t necessarily say that Claude is better than Gemini, though, because the prompts I used to get to the instanced objects were quite a bit different from the ones I used with Gemini.

After I backed off the instancing with Gemini and tried to ratchet up the complexity of the earlier scene, it responded to some requests pretty well and to others not so much. I also attempted to render some other scenes, like ocean waves and a fighter jet with afterburners, with varying degrees of success. Though this was by no means a comprehensive test, I was impressed by what the AI could do, even if the utility is a bit questionable. Getting a good result from one prompt by no means guarantees you’ll get a decent result from a similar prompt in a different instance of using the AI, which I learned when I tried to recreate the grass scene in a different browser window and ran into issues I had not encountered previously and that proved intractable to solve in this new context. It is certainly conceivable that I am not the most skilled “prompt engineer”, but thinking about becoming more advanced at making requests of AI systems prompts a couple of questions.
Wheat blowing in the wind generated by Gemini using three.js
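For completeness, here is a rough sketch of the instanced version I was asking for, again my own illustration of the technique rather than what either model produced, with all counts and parameters as placeholders:

```javascript
import * as THREE from 'three';

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(60, window.innerWidth / window.innerHeight, 0.1, 200);
camera.position.set(0, 4, 12);
camera.lookAt(0, 0, 0);
const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);
scene.add(new THREE.HemisphereLight(0xbfd8ff, 0x3a5f0b, 1.0));

// One InstancedMesh holds every blade: one draw call instead of thousands of scene objects
const COUNT = 20000;
const bladeGeometry = new THREE.PlaneGeometry(0.05, 1, 1, 4);
bladeGeometry.translate(0, 0.5, 0);
const bladeMaterial = new THREE.MeshLambertMaterial({ color: 0x4caf50, side: THREE.DoubleSide });
const grass = new THREE.InstancedMesh(bladeGeometry, bladeMaterial, COUNT);
scene.add(grass);

// Per-blade state now lives in plain arrays, not in individual Mesh objects
const offsets = [];
for (let i = 0; i < COUNT; i++) {
  offsets.push({
    x: (Math.random() - 0.5) * 40,
    z: (Math.random() - 0.5) * 40,
    yaw: Math.random() * Math.PI,
  });
}

// Animating means rewriting the per-instance matrices each frame; the next optimization
// step would be to move the sway into a vertex shader so this CPU loop disappears entirely
const dummy = new THREE.Object3D();
const clock = new THREE.Clock();
function animate() {
  requestAnimationFrame(animate);
  const t = clock.getElapsedTime();
  for (let i = 0; i < COUNT; i++) {
    const o = offsets[i];
    dummy.position.set(o.x, 0, o.z);
    dummy.rotation.set(0, o.yaw, 0.25 * Math.sin(2.0 * t + 0.5 * o.x));
    dummy.updateMatrix();
    grass.setMatrixAt(i, dummy.matrix);
  }
  grass.instanceMatrix.needsUpdate = true;
  renderer.render(scene, camera);
}
animate();
```

The reason this conversion trips things up is that per-object state disappears: instead of thousands of meshes, each with its own position and rotation, there is a single InstancedMesh and a buffer of matrices, so every piece of the original code that touched an individual blade has to be rewritten against that buffer.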
First, is the time it would take to learn how to properly talk to an AI worth it when you could just spend that time learning the subject matter for which you are trying to use the AI? It already seems pretty apparent that without knowing many of the implementation details of what you’re trying to do, and without the technical vocabulary to guide the AI’s responses, your chances of getting the results you want decrease significantly. To expound on this further: any time invested in learning to prompt these systems as they currently operate has to be weighed against the speed at which they have been progressing. By the time you get anywhere near optimal at prompting them, they may well ‘comprehend’ natural language and context well enough to give you desirable results without all the linguistic hoop-jumping.

Secondly, it’s clear that for any task of even slight complexity, there is going to be a good amount of work on the back end manually cleaning things up. It may be worth it in a lot of cases to have an AI churn out a bunch of code that you then go in and prune. I can definitely see some solid use cases for this, with the caveat that even a slightly complex coding project could need major refactoring, since the results can be so inconsistent even from very similar prompts. Would it not be better just to build out a project by forking an existing, related GitHub repo you know doesn’t have these issues, or by building it from scratch so you know the code base inside and out from the start? There’s also the matter of becoming too reliant on AI to generate code and letting your knowledge atrophy (knowledge that is still ostensibly needed to guide the AI properly) as a result of not doing the hard work yourself.

Overall, from what I’ve seen using the free versions of Gemini and Claude, these tools are clearly starting to show their potential to increase productivity in the programming realm, though I haven’t quite figured out where exactly that value is best realized. In the next few months, I may find uses for them when I need to slightly tweak some boilerplate code. For pure boilerplate stuff, I still trust CTRL+C/CTRL+V over the risk of an AI hallucinating or pulling code from a bad source. For anything more custom, I feel like watching and waiting to see how things advance over the next year is the way to go. “AI” has been getting a lot of hype recently, and while there are definitely companies grifting off of it, I think the hype is generally well deserved: the real deal already appears to be here to some degree, with the bulk of the iceberg seemingly just over the horizon at the current rate of progress. This was far from a comprehensive evaluation of these tools and left out other major players like OpenAI, but the vibes I got from this little experiment were definitely promising for the near future, assuming things don't hit a massive roadblock. I may revisit this topic in a few months to test out some other applications, perhaps even splurging on the professional versions. For now, I'm going to work on improving my hard skills and save the vibes for my stereo.