catbear wrote,

An army of me

Let us say you wish to build a custom house. You have the design of it prepared in your head, you have the permits, the materials, and you, personally, have all of the skills needed from site grading to wallpaper hanging. But you would like it to be built in a week. This is so absurd as to be funny.

However, if you were a computer program, and if you ran on Snow Leopard, it might be doable.

Software is just a list of instructions. A computer is a means for following those instructions. Apple's new operating system has "0 new features" for the users, but has some very important new features for the people who create software (the developers), and for the software itself. These new features make building your own house in a week conceptually possible.

The first feature I'll discuss makes writing software easier. This (or rather, these) are formally called "closures", and in concept are similar to the more familiar subroutines. Both are a list of instructions for a particular task. Closures, however, are different from subroutines in a way that makes developers much more efficient. Let's return to our house-builder.

If you have a willing assistant who doesn't know much about building houses, and you want their help, you need to give your assistant some very specific instructions on what to do. Not only what to do, but where to do it. And with what materials. And which tools. This takes a lot of time, of course -- training is what differentiates unskilled labour from craftsmen. Returning to computer software, then, your assistant is similar to a subroutine.

Subroutines are explicit and specific. You must create them by specifying everything you need done. And when a subroutine is run, everything must be in place for it to begin. (All of the tools and lumber must be ready, and you hand your assistant an extremely detailed list of instructions.)

Closures are to subroutines what you (the home builder) are to an assistant. Because closures are typically created with context (materials) already in place, a great deal of setup can be eliminated. And because closures are created with preliminary tasks already in progress, a lot of detailed elucidation can be eliminated. (This is a simplification, and is not always the case, but is a useful illustration.)

In other words, closures allow you to define a program task in terms of how you (yourself) would do it, rather than in terms of how you would tell an assistant to do it. This makes for faster programming and fewer bugs. (Although remaining bugs can be spectacular.)
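For the curious, here is a minimal sketch of the difference. It is written in Python rather than in Apple's blocks syntax, purely because it is short and easy to try, and every name in it (the wallpaper functions, the tools) is made up for illustration. The first function is the "assistant": it must be handed every detail on every call. The second builds a closure that remembers its materials and tools, so afterwards you only need to name the room.

```python
# Illustrative only: hypothetical names, and Python instead of C blocks.

def hang_wallpaper_subroutine(room, pattern, paste, tools):
    """Subroutine style: the caller supplies every detail on each call."""
    return f"{room}: hung {pattern} with {paste} using {', '.join(tools)}"


def make_wallpaper_hanger(pattern, paste, tools):
    """Closure style: capture the context once, then just name the room."""
    def hang(room):
        # pattern, paste and tools are remembered from the enclosing call.
        return f"{room}: hung {pattern} with {paste} using {', '.join(tools)}"
    return hang


tools = ["brush", "knife"]

# The assistant needs the full briefing every time.
explicit = hang_wallpaper_subroutine("hall", "stripes", "wheat paste", tools)

# The closure already knows the materials and tools.
hang = make_wallpaper_hanger("stripes", "wheat paste", tools)
captured = hang("hall")
```

Both calls do the same job; the closure version simply carries its context with it, which is the "plumbing" you no longer write by hand.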

Closures are not a new feature of programming languages -- they've been around in functional languages for a long time. However, with Snow Leopard, Apple has added closures (as a language extension it calls "blocks") to the system and application programming languages C, C++ and Objective-C -- where they have never been before.

In so doing, Apple has upgraded languages which have been the de facto standard tools for creating operating systems and end-user applications for, in the case of C, almost forty years.


The other Snow Leopard feature I'll cover is called Grand Central Dispatch (GCD). This is a mechanism for creating, if you will, armies of tasks. The old way of getting things done on computers is for everything, every task, to be done in sequence. Of course: old computers only had one processor. Computers these days have two, four, eight or more processors -- and with processors already executing 2 to 4 billion instructions per second, adding more processors is much cheaper than trying to make a single processor faster.

A task is like a subroutine -- it's something that needs to be done. Load a file, load the photo, format the document, draw the document on the screen, prepare for printing, print a page, print a page, save the document, and so on. Doing these things in sequence makes sense on an old computer where only one thing can be done at a time anyway. But on new computers with multiple processors, we find that sequential programming as above results in a lot of wasted capability. One processor will be running full speed doing all these steps in sequence and the other processors, with nothing to do, are just sitting there.

Grand Central Dispatch allows the programmer to take each of those above tasks and break them apart into work units. If they were all run simultaneously, that would be very fast but not usable -- one cannot begin printing a page before the document has been loaded. What is new with GCD is that when you define work units, you can also specify that they cannot start until some other work unit has finished its task.

As in the above example, then, we could say that the page formatting work unit can't begin until the document loading work unit has completed. And the screen updating and printing work units can't begin until the formatting work unit is complete. There are tasks that are dependent on earlier tasks, and tasks that can be done at the same time. When you write a program using GCD, you define it in terms of tasks, and how those tasks depend on each other.
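To make the load, format, draw and print example concrete, here is a rough sketch. It uses Python's standard thread pool as a stand-in, not the GCD API (GCD itself expresses ordering through its queues and groups), and the function names are hypothetical. Each work unit is handed to the pool at once; a unit that depends on another simply waits for that unit's result before it starts its own work.

```python
from concurrent.futures import ThreadPoolExecutor

log = []  # records the order in which work units actually ran


def load_document():
    log.append("load")
    return "raw text"


def format_pages(text):
    log.append("format")
    return [text.upper()]


def draw_on_screen(pages):
    log.append("draw")
    return f"drew {len(pages)} page(s)"


def print_pages(pages):
    log.append("print")
    return f"printed {len(pages)} page(s)"


# Four workers for four work units, so a unit that is waiting on its
# dependency never blocks the unit it is waiting for.
with ThreadPoolExecutor(max_workers=4) as pool:
    loaded = pool.submit(load_document)
    # Formatting cannot begin until loading has finished.
    formatted = pool.submit(lambda: format_pages(loaded.result()))
    # Drawing and printing both wait on formatting, but not on each other.
    drawn = pool.submit(lambda: draw_on_screen(formatted.result()))
    printed = pool.submit(lambda: print_pages(formatted.result()))
    results = (drawn.result(), printed.result())
```

Loading always runs first and formatting second; drawing and printing may then happen in either order, or at the same time.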

GCD takes these tasks and the dependencies they have and figures out how to make them run as quickly and efficiently as possible -- all the while running in the correct order. If your computer has two processors, GCD may only set two things running at once. But if your computer has eight processors, you might get eight -- or sixteen -- things running at once, some of them from other programs you might be using at the same time. Because GCD is a system feature, what really happens is that all of your programs, all of them that are running, create their own tasks and dependency lists, then hand them all over to GCD -- which makes sure everything happens efficiently and correctly.
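A small sketch of the "as many at once as the machine can handle" idea, again using Python's thread pool as a stand-in and a made-up render_page function. The program only says what needs doing; the number of workers is taken from the machine it happens to be running on. Two caveats: unlike GCD, this pool belongs to one program rather than to the whole system, and in standard Python, threads will not actually speed up number-crunching work. The point here is the shape of the code, not the performance.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def render_page(n):
    """Hypothetical stand-in for a real unit of work."""
    return n * n


# Let the machine decide how many units run at once:
# two workers on a two-processor computer, eight on an eight-processor one.
workers = os.cpu_count() or 1

with ThreadPoolExecutor(max_workers=workers) as pool:
    # Sixteen independent work units; results come back in submission order.
    pages = list(pool.map(render_page, range(16)))
```

The same sixteen work units are submitted whether the machine has one processor or many; only the number running side by side changes.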


So, when you think about the combination of closures and GCD, you find that the dominant paradigm of application and system computer programming has been turned on its head. Instead of writing a program which executes a series of very carefully and explicitly defined instructions in purely sequential order, we now have a system which allows you to define an arbitrary set of work units that will operate within a clear and concise context, and execute as rapidly and effectively as possible.

So to get that house built in a week, instead of hiring a thousand untrained assistants, giving them phone-book sized instructions and trying to manage them yourself without accidents and errors -- you just set about to do the job yourself (via the efficiency and clarity of closures), and the system via GCD will create as many clones of you as are necessary for each of you to get the job done as fast as possible -- without stepping on toes, and without hanging wallpaper before the foundation has been poured.

It's an army of me.

This is a huge, huge thing, and once programmers really take hold of it, the next decade will see some real progress not only in usability (fewer "beachballs of doom"), but also in creating more complex tools that have fewer errors.

Way to go, folks.

---

* Closures do not mean you don't have to be clear and specific in writing your software -- you still must elucidate everything that must be done. They do mean, however, that you don't have to spend the majority of your programming time writing clumsy and error-prone "plumbing" code that ties things together, and can concentrate much more intently on the functionality that is specific to your application.

  • 4 comments

galador42

September 8 2009, 16:19:23 UTC

Brilliantly written, so even a non-programmer like myself understands. I'm even more impressed with Apple now.

catbear

September 8 2009, 17:40:42 UTC

Thank you!

funos

September 8 2009, 19:25:39 UTC

Nice post. Good analogies.
And GCD, a group scheduler, sounds like a *fantastic* idea.

My question is: can you define your closures at compile time?
(and better yet: compile them at runtime?)
Because that would generate *customized and optimized* clones of yourself (via function specialization/partial evaluation/constant propagation), which is very powerful juju.

catbear

September 8 2009, 20:01:58 UTC

Thanks.

Native (C/C++/ObjC) closures (referred to as blocks within the documentation) are compile-time, defined similarly to function pointers but with sacks of instantiation, execution and scope-dependent hocus-pocus. There are some good Apple white papers.

You can, however, get run-time source-based compilation and execution of massively parallel, compute-limited, data-driven operations using OpenCL, another Snow Leopard goodie. (It's not Apple's alone, but I think it's the first end-user OS to feature it.) Interestingly, Core Image also uses just-in-time compilation to do all that fancy VFX.