
I like ChatGPT. A lot. It's the best digital writing assistant I've ever used.

If my time using ChatGPT has taught me anything, though, it's that ChatGPT isn't a brilliant writer. I'm not worried about my job. And if my experience using ChatGPT to write a Python program from scratch is any indication, programmers don't need to worry either.

But you should be paying attention.

This article was originally published in my Curious About Code newsletter.

A Promising Failure

For a while now, I've been thinking about building a terminal app for typing practice, so I decided to give ChatGPT a shot at building it.

This was my prompt:

I want you to write a Python program for me that can be used to practice typing from the terminal.

The app should pick on of 5 paragraphs of text and display it to the user. The user will then type the text displayed in the terminal. The characters they type will be displayed below the paragraph, and will be colored according to whether or not the user typed the correct character: green if it is correct or red otherwise. When the user has finished typing the paragraph, the typing speed (in words per minute) and typing accuracy (as a percentage of the total number of characters in the paragraph) should be displayed, and the program should exit.

Here are some more requirements:

- Paragraphs should contain text that is appropriate for typing practice.
- Paragraphs should not end with a newline character.
- As soon as the user has typed the same number of characters as are contained in the paragraph, the program should stop reading input and display the typing speed and accuracy.

ChatGPT's program is surprisingly close to what I want:


It's impressive how much ChatGPT gets right here. The code runs error-free. It gets the accuracy calculation correct and computes raw speed in words per minute. It is, for all practical purposes, a functional typing analysis app.

But it doesn't color the user input as I requested:

Again, the code runs. Here's a sample execution:

That's not quite what I want, but it's getting closer. I tell ChatGPT that the program doesn't work as expected and reiterate that the colors should be applied to the text as the user types it.

Things begin to derail:

It looks reasonable, but it doesn't work:

I spent a good half an hour trying to get ChatGPT to fix its code, but things just kept getting weirder and weirder. I let it know how I felt before I closed the chat:

In reality, though, I was impressed. As I'd found when generating text, ChatGPT struggles to produce good code in bulk. But it captured enough of what I asked for that it inspired a new plan of attack:

Operate on smaller chunks.

A Productive Approach

Up until now, I had interacted with ChatGPT as a consumer with no knowledge of programming. For my second attempt, I took a more active role in solving the problem and gave ChatGPT much more specific instructions.

I split up the program into several functions. I already knew how to tie them together, and was curious to see if ChatGPT could fill in the gaps.

This surprised me. Based on my previous conversation, I expected a solution that didn't use tty.setraw(). It would have been incorrect, but consistent with the kind of errors I saw before.

It's also worth noting that ChatGPT's implementation won't work on Windows machines. I use a Mac, so I left the code as is and continued:

Pretty good! Just needs some color:


Next, I asked ChatGPT to write a function to calculate the gross typing speed. Here's what I got:

This calculates raw typing speed by dividing the number of words typed by the time taken. The problem is: this metric ignores the length of words and overlooks symbols such as spaces and punctuation. None of these alter the word count, but they do impact typing speed. I asked for the gross speed, which considers any sequence of five characters to be a word.

It’s a flaw in ChatGPT’s code, but an easy manual fix:

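def typing_speed(text, start_time, end_time):
    # Gross speed: every five characters counts as one word.
    words = len(text) / 5
    # Timestamps are in seconds, so convert the elapsed time to minutes.
    minutes = (end_time - start_time) / 60
    return words / minutes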
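
To see how far the two metrics can drift apart, here is a small comparison. It is only a sketch: word_count_speed() is a hypothetical stand-in for the word-counting approach described above (not ChatGPT's actual output), and the sample sentences and the 60-second duration are made up for illustration.

```python
def word_count_speed(text, start_time, end_time):
    # Hypothetical stand-in: counts whitespace-separated words.
    words = len(text.split())
    minutes = (end_time - start_time) / 60
    return words / minutes


def typing_speed(text, start_time, end_time):
    # Gross speed: every five characters counts as one word.
    words = len(text) / 5
    minutes = (end_time - start_time) / 60
    return words / minutes


short_words = "I am a cat."
long_words = "Extraordinary circumstances notwithstanding."

# Pretend each sentence took exactly 60 seconds to type.
for text in (short_words, long_words):
    by_words = word_count_speed(text, 0, 60)
    by_chars = typing_speed(text, 0, 60)
    print(f"{text!r}: {by_words:.1f} wpm by word count, {by_chars:.1f} wpm gross")
```

The word-count metric ranks the short sentence as faster because it contains more words, even though the long sentence has four times as many characters. The gross metric ranks them the other way around.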

I moved on to the next function I had in mind:

This one really stands out to me. The unsolicited addition of printing the typing speed threw me off at first. But then it gave me an idea. I altered my design:

ChatGPT had most of the pieces it needed now. Could it finish the program for me?

The accuracy() function works, but can be simplified. I can do that manually:

def accuracy(typed_list):
    correct = 0
    for typed, expected in typed_list:
        if typed == expected:
            correct += 1
    return correct / len(typed_list) * 100
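
The loop could be condensed further. One possible variant is sketched below, assuming typed_list is a list of (typed, expected) pairs as in the function above. The guard for an empty list is an added assumption; nothing in the original design requires it.

```python
def accuracy(typed_list):
    """Return the percentage of (typed, expected) pairs that match."""
    if not typed_list:
        # Assumption: report 0% rather than divide by zero on empty input.
        return 0.0
    # True counts as 1 and False as 0, so sum() tallies the matches.
    correct = sum(typed == expected for typed, expected in typed_list)
    return correct / len(typed_list) * 100


pairs = [("c", "c"), ("a", "a"), ("r", "t")]
print(f"{accuracy(pairs):.1f}%")
```

Whether this reads better than the explicit loop is a matter of taste; the behavior is the same for any non-empty list.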

I'd also never seen ChatGPT use comments as placeholders like that. That's a nice way to shorten a long code block that references prior chat interactions.

I put it all together and ran it:

Not sure that characterizes gentlemen, but ok.

Exactly what I wanted. The whole process took about 10 minutes.
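
For readers who want a feel for how pieces like these can be tied together, here is a minimal sketch. It is not the code from the chat. The run_session() function and its read_char, write, and clock parameters are hypothetical names chosen so the logic can run without a real terminal, and the colors are assumed to be plain ANSI escape codes.

```python
import time

# ANSI escape codes (assumed; most Unix terminals understand them).
GREEN, RED, RESET = "\033[32m", "\033[31m", "\033[0m"


def typing_speed(text, start_time, end_time):
    words = len(text) / 5  # gross speed: five characters per word
    minutes = (end_time - start_time) / 60
    return words / minutes


def accuracy(typed_list):
    correct = 0
    for typed, expected in typed_list:
        if typed == expected:
            correct += 1
    return correct / len(typed_list) * 100


def run_session(paragraph, read_char, write, clock=time.monotonic):
    """Hypothetical driver: show the text, read keystrokes, score the result."""
    write(paragraph + "\n")
    typed_list = []
    start = clock()
    # Stop as soon as the user has typed as many characters as the paragraph has.
    for expected in paragraph:
        typed = read_char()
        color = GREEN if typed == expected else RED
        write(f"{color}{typed}{RESET}")
        typed_list.append((typed, expected))
    end = clock()
    return typing_speed(paragraph, start, end), accuracy(typed_list)


# Simulated run: scripted keystrokes and a fake clock instead of a terminal.
keys = iter("hellp world")
ticks = iter([0.0, 60.0])
output = []
wpm, acc = run_session("hello world", lambda: next(keys), output.append,
                       clock=lambda: next(ticks))
print(f"{wpm:.1f} wpm, {acc:.1f}% accuracy")
```

In a real terminal, read_char would be a function that reads one keystroke in raw mode and write would print and flush each character. Passing them in as parameters keeps the scoring logic easy to check on its own.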

Thoughts On ChatGPT

Take them or leave them.

  • ChatGPT struggles to generate long bodies of text coherently. That's not surprising, given that it's optimized for short, chat-style interactions.
  • I'm skeptical that large language models will ever be able to generate large, complex software automatically. You can, however, guide a model to build something relatively complex.
  • ChatGPT codes like an expert beginner. It can help you be productive, but it can't be trusted.
  • The best use cases for ChatGPT-like models are generating, refactoring, reformatting, and improving small code snippets. I think my experiment also shows that ChatGPT can be used to quickly prototype an idea.
  • I can't help but wonder if large language models will be used to improve translation from natural to machine language, enabling a new paradigm of spoken-word programming.

The era of the programmer is not over. But the landscape is changing.

Update: February 13, 2023

A reader named Alexander commented on Hacker News that he got ChatGPT to write the program in just two prompts by adding one sentence to my original prompt and then asking it to make a correction.

It's an interesting approach, but after trying it out several times, I couldn't replicate Alexander's success consistently. ChatGPT is non-deterministic, which may explain the inconsistency. I also had mixed results getting ChatGPT to fix small errors in large code samples. Often, the model struggled to "remember" code it had previously written and replaced working code with entirely new code that behaved differently.

In a continuation of his response on Twitter, Alexander says, "Contrary to Hackernews myths, [ChatGPT] can code advanced software." But a simple command-line app isn't advanced software. It takes teams of dozens, even hundreds, of human programmers to write advanced software. I haven't seen any evidence that large language models can do that, or will any time soon.

I have seen evidence that they will make writing software accessible to more people, many of whom will never learn a computer programming language.