Philip J. Guo - Personal Home Page

Introductory Computer Programming Education

by Philip Guo (philip@pgbovine.net)

This article presents my opinions on some problems with how computer programming is currently taught and what types of courses I feel are good for teaching programming to advanced high school and early college students. This is not a proposal for a complete undergraduate Computer Science curriculum, only for introductory courses that teach programming and some rudimentary Computer Science concepts. (I'm not an education expert by any means; these opinions are just drawn from my own learning and programming experiences.)

Problems with how computer programming is currently being taught

My proposed introductory computer programming curriculum

Here are the courses that I would teach in an introductory programming curriculum; hopefully these courses can eliminate some of the problems I've described in the previous section.

  1. Introduction to programming concepts and applications, taught using Python immersed in a UNIX command-line environment - I strongly believe in teaching Python to students as a first language (see here for some reasons) because it's so easy to get started solving small but useful problems with minimal overhead. This course should teach the basic concepts of procedural programming (e.g., control flow, basic data structures, functions), and there should be a heavy emphasis on solving real-world problems rather than toy problems. Concurrently with learning Python, students should be learning to get comfortable in a command-line environment, familiarizing themselves with how to create and manipulate files and directories and using an assortment of standard UNIX tools. Being able to work on the command line as well as to program in Python allows students to create applications that work on real data and produce real results rather than being confined to one program working in isolation. Students should be exposed to a wide variety of libraries for interesting tasks such as downloading content from the web, parsing HTML, manipulating images, developing simple plug-ins to instant messaging clients or social networking sites, etc.

    By the end of this course, my pipe dream is for students to be able to recognize how to solve some simple computing problems that they face every day (e.g., organizing their music collection) by writing small Python scripts. I do not expect students to be able to construct software systems of any significant size after taking only this course, but they can actually achieve a lot with a 100-line Python program :) The point of this course is to give students exposure to the power of programming and to motivate them to learn more about programming in later courses.
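
As a rough illustration of the kind of everyday script mentioned above (organizing a music collection), here is a minimal sketch. It assumes, purely for the example, that files are named 'Artist - Album - Title.ext' and sit directly inside one folder; the function names target_for and organize and the list of extensions are hypothetical choices, not part of any course material.

```python
import shutil
from pathlib import Path

# Assumption for this sketch: only these extensions count as music files.
AUDIO_EXTENSIONS = {".mp3", ".flac", ".ogg", ".m4a"}


def target_for(filename):
    """Map 'Artist - Album - Title.ext' to Path('Artist/Album/Title.ext').

    Returns None when the name is not an audio file or does not follow
    the (assumed) three-part naming convention.
    """
    path = Path(filename)
    if path.suffix.lower() not in AUDIO_EXTENSIONS:
        return None
    parts = [part.strip() for part in path.stem.split(" - ")]
    if len(parts) != 3 or not all(parts):
        return None
    artist, album, title = parts
    return Path(artist) / album / (title + path.suffix)


def organize(root, dry_run=True):
    """Move audio files sitting directly in `root` into Artist/Album folders.

    With dry_run=True nothing is moved; the planned moves are only returned
    as a list of (source, destination) pairs.
    """
    root = Path(root)
    moves = []
    for source in sorted(root.iterdir()):
        if not source.is_file():
            continue
        relative_target = target_for(source.name)
        if relative_target is None:
            continue  # leave files we do not understand alone
        destination = root / relative_target
        moves.append((source, destination))
        if not dry_run:
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(source), str(destination))
    return moves
```

Calling organize("some_folder") only reports what would be moved; passing dry_run=False actually moves the files. Defaulting to a dry run is a deliberate choice here, since a beginner's script that touches real files should be easy to try out safely.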

  2. Introduction to software engineering, taught using Java within the Eclipse IDE - After learning how to do quick-and-dirty hacks using Python to simply 'get stuff done', students should learn the concepts of how to engineer reliable medium-sized software systems, this time using a much more strict (and industry-accepted) language: Java. Students should learn about abstraction, specifications, modularity, decoupling, testing, debugging, and other software engineering concepts, with the assumption that they know how to do basic programming. This will also be the students' first exposure to object-oriented programming and some simple design patterns. Also, they will learn to work in a modern feature-rich IDE (Eclipse) and learn how to use the integrated debugger and other useful features (such as refactorings, on-the-fly compilation, jumping to identifier definitions and uses).

    This course should culminate in a team project where a group of 3-4 students work together to build a GUI program that they could be proud to show to their friends or family. After all, command-line programs just aren't very sexy, and most computer users think of GUI programs as the ONLY type of program, so by building a GUI program, students can see that they too can create something that looks and feels sort of like the programs they use in their everyday computing (thus bridging that conceptual disconnect between 'toy programs' and 'real programs'). By the end of this course, students should have some ideas of how to design and implement a medium-sized software system, rather than just hacking together spaghetti code, and be prepared to work as a software engineering intern in the industry in order to reinforce what they learned in class.

  3. Introduction to low-level systems programming, taught using C in a UNIX command-line environment - The main reason that I want to teach a programming course in C is to show students how to do low-level things that are easy in C but impossible in higher-level safe languages, most notably treating memory as untyped bytes and manipulating it at will, casting the hell out of pointers and totally abusing the (already weak) type system. One possible application is to build a simple database application that needs to divide up a chunk of memory into records, and depending on meta-data stored within each record, interprets the bytes of the records as different types (e.g., int, string, bool), and also stores/loads chunks of raw bytes to/from the hard disk.
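
To make the record idea above a little more concrete, here is a sketch of one possible byte layout, written in Python with the standard struct module rather than in C. The layout (a 1-byte type tag, a 2-byte length, then the payload) and all names are assumptions made up for this illustration. Python checks every access, so this shows only the 'same bytes, different interpretation' idea and none of the memory corruption hazards that the C version would expose.

```python
import struct

# Hypothetical record layout (an assumption for this sketch):
#   1 byte   type tag
#   2 bytes  payload length (little-endian, unsigned)
#   N bytes  payload
TAG_INT, TAG_STRING, TAG_BOOL = 1, 2, 3
HEADER = struct.Struct("<BH")


def pack_record(value):
    """Encode one value as header + payload bytes."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        tag, payload = TAG_BOOL, struct.pack("<?", value)
    elif isinstance(value, int):
        tag, payload = TAG_INT, struct.pack("<i", value)
    elif isinstance(value, str):
        tag, payload = TAG_STRING, value.encode("utf-8")
    else:
        raise TypeError("unsupported type: " + type(value).__name__)
    return HEADER.pack(tag, len(payload)) + payload


def unpack_records(buffer):
    """Walk a chunk of raw bytes and decode each record by its tag."""
    records = []
    offset = 0
    while offset < len(buffer):
        tag, length = HEADER.unpack_from(buffer, offset)
        offset += HEADER.size
        payload = bytes(buffer[offset:offset + length])
        offset += length
        if tag == TAG_INT:
            records.append(struct.unpack("<i", payload)[0])
        elif tag == TAG_STRING:
            records.append(payload.decode("utf-8"))
        elif tag == TAG_BOOL:
            records.append(struct.unpack("<?", payload)[0])
        else:
            raise ValueError("unknown tag: %d" % tag)
    return records


def save(path, values):
    """Store the records as one chunk of raw bytes on disk."""
    with open(path, "wb") as f:
        f.write(b"".join(pack_record(v) for v in values))


def load(path):
    """Load the raw bytes back and reinterpret them as typed values."""
    with open(path, "rb") as f:
        return unpack_records(f.read())


def int_bits_as_float(n):
    """Reinterpret the 4 bytes of a 32-bit int as a 32-bit float."""
    return struct.unpack("<f", struct.pack("<i", n))[0]
```

In C, the same exercise would be done with pointer casts over a malloc'd buffer, and a wrong length or tag would silently read garbage instead of raising an exception, which is exactly the lesson this course is after.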

    The general idea is to show how tricky it is to get things right when doing low-level programming and to give students a taste of what's really going on 'under the hood', while also showing them that they really don't need to be programming in C anymore (except for low-level systems software)! Students should get familiar with using gdb to debug scary and frustrating memory corruption bugs, build up some mental fortitude, and gain a greater appreciation for modern languages.

    Another potentially interesting way to teach this class would be to use C to manually implement basic data structures and algorithms that were taken for granted when programming in Python and Java, so that students can fully appreciate the benefits of using a higher-level language and also learn how to write more efficient code in higher-level languages because they better understand how things are implemented 'under the hood' (see the article Back to Basics by Joel Spolsky).

    Programs written in C (most notably operating systems, libraries, and compilers/interpreters for higher-level languages) form the foundation of all modern software systems (in the old days, machine-specific assembly language code reigned supreme, but nowadays, C compilers can create more optimized code than humans can for most domains). Thus, even if students don't need to write C programs in the future, they should understand the foundation upon which their own software is built.

  4. (Optional) Introduction to functional programming, taught concurrently using both Scheme and ML - This optional course is for people who are REALLY curious about more advanced programming language concepts, most notably recursion, higher-order functions, closures, polymorphic type inference (wha???), and functional programming. Scheme and ML should be taught concurrently to emphasize their similarities and differences, especially the tradeoffs between dynamic and static typing. Because I don't want to bore everyone except for the math/theory geeks, this class should still at least attempt to present some halfway-practical problems. Perhaps good motivation for functional programming can come from re-implementing solutions to problems in more elegant ways than could be done in C or Java.
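
For readers who have not met the concepts named in the optional functional programming course, the following sketch shows closures, higher-order functions, and recursion. It is written in Python only to keep a single language for examples in this article; the course itself would use Scheme and ML, and the function names here are made up for the illustration.

```python
from functools import reduce


def make_counter():
    """Closure: `increment` remembers `count` between calls."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


def compose(*functions):
    """Higher-order function: build f1(f2(...fn(x))) from its parts."""
    def composed(x):
        for function in reversed(functions):
            x = function(x)
        return x

    return composed


def flatten(nested):
    """Recursion: flatten arbitrarily nested lists into one flat list."""
    if not isinstance(nested, list):
        return [nested]
    # Fold the flattened pieces together instead of mutating a result list.
    return reduce(lambda acc, item: acc + flatten(item), nested, [])
```

For example, compose(lambda x: x + 1, lambda x: x * 2)(5) first doubles and then adds one, and flatten([1, [2, [3, 4]], 5]) returns a single flat list. Python is dynamically typed like Scheme, so this sketch says nothing about the static typing and type inference side of ML.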

Programming languages to never teach to beginners

Related Work

Several other technical bloggers have written about the topic of what to teach students in an introductory programming curriculum, and there are many varied viewpoints.

A reputable hacker, Eric S. Raymond, shares my views on the Python -> Java -> C order of teaching programming to beginners, as described in a section of How To Become A Hacker.

Lots of high school and college educators now favor an all-Java curriculum (the high school AP Computer Science curriculum moved from C++ to Java a few years ago). In contrast, Joel Spolsky is 'old school' and favors the Back to Basics approach of starting low-level with C and then progressing to more theoretical Computer Science concepts with Scheme. Although I support teaching both C and Scheme, I think that they should come only after teaching so-called 'easier' languages such as Python and Java, because my #1 priority is to get students to be motivated to write interesting and useful programs, not to have them struggle to understand the intricate arcane details of pointers (taught using C) or recursion (taught using Scheme).

Acknowledgments

Thanks to Robert Ikeda for his input into this topic and for listening to my rants late one evening when I should have been working on research.


Feel free to send comments, suggestions, questions, or rants to me via email: philip@pgbovine.net

Here are some email responses to this article.

Created: 2007-05-24
Last modified: 2007-06-17