Musings of a Programmer

Rarely-used blog of Dan Harper.

PHP 5.5 introduced a new concept called "Generators". Put simply, these allow an ordinary function to "return" multiple times, using the yield keyword.

The go-to example is re-implementing PHP's built-in range(). If you were to call range(0, 1000000), you'd build a huge in-memory array of just over one million items before you could iterate through it.
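To make the cost concrete, here is a rough sketch of the eager version. The memory figure it prints depends on your PHP version and platform, so treat it as an illustration rather than a benchmark.

```php
<?php
// Eager approach: range() builds the complete array before the loop starts.
$before  = memory_get_usage();
$numbers = range(0, 1000000);   // 1,000,001 integers, all held in memory at once
$after   = memory_get_usage();

$sum = 0;
foreach ($numbers as $number) {
    $sum += $number;
}

printf(
    "Items: %d, extra memory: roughly %.1f MB\n",
    count($numbers),
    ($after - $before) / 1048576
);
```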

With generators, you can instead produce each value only as it's requested, so the huge array never has to exist. The usual demonstration is range() re-implemented as a generator named xrange().
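A minimal sketch of such an xrange() might look like this. It is deliberately simplified: it assumes a positive $step and $start <= $end, and skips the argument checking the real range() does.

```php
<?php
/**
 * Lazily yields integers from $start to $end (inclusive).
 * Simplified sketch: assumes $step > 0 and $start <= $end.
 */
function xrange($start, $end, $step = 1)
{
    for ($i = $start; $i <= $end; $i += $step) {
        yield $i;   // hand back one value, then wait to be asked for the next
    }
}

// Calling xrange() returns a Generator object; nothing is computed yet.
$generator = xrange(1, 5);

// iterator_to_array() drains the generator, which is fine for a tiny range.
echo implode(', ', iterator_to_array($generator)), "\n";   // prints "1, 2, 3, 4, 5"
```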

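Notice that yield keyword? That's new. What it does is pause the current function, passing the value to the code iterating over the generator. When the next value is requested, the function resumes right where it left off. Usage is identical to range(), with a plain foreach.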
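If the pausing is hard to picture, a small trace helps. The tracedRange() function below is a made-up helper purely for illustration: it behaves like a simple range generator, but records when its body runs so you can see it interleave with the loop.

```php
<?php
// Hypothetical helper for illustration only: yields $start..$end and logs each step.
function tracedRange($start, $end, array &$log)
{
    for ($i = $start; $i <= $end; $i++) {
        $log[] = "yielding $i";
        yield $i;   // execution pauses here until the loop asks for another value
    }
    $log[] = "generator finished";
}

$log = [];
foreach (tracedRange(1, 3, $log) as $number) {
    $log[] = "loop got $number";
}

print_r($log);
```

The log alternates between "yielding N" and "loop got N", and "generator finished" only appears once the loop asks for a fourth value that isn't there.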

Soo... that's cool. But how about a real-world example?

Paging the YouTube API

For a recent side project, I needed to query the YouTube API for videos in a playlist. Their API paginates results at five per page. My initial code repeatedly called the API to retrieve the next page, pushing the results onto $fetchedResults.
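Something along these lines, as a sketch. To keep it runnable without network access, fetchPage() is a hypothetical stand-in for the real HTTP request that serves fake pages from an array. The 'items' and 'nextPageToken' keys and the playlist ID are assumptions for the example, not the original code.

```php
<?php
// Hypothetical stand-in for the real API call: returns one page of results
// plus the token for the next page (null on the last page).
function fetchPage($playlistId, $pageToken = null)
{
    $pages = [
        'start' => ['items' => ['video-1', 'video-2'], 'nextPageToken' => 'p2'],
        'p2'    => ['items' => ['video-3', 'video-4'], 'nextPageToken' => 'p3'],
        'p3'    => ['items' => ['video-5'],            'nextPageToken' => null],
    ];

    return $pages[$pageToken === null ? 'start' : $pageToken];
}

// Eager version: walks every page and collects everything before returning.
function getPlaylistVideos($playlistId)
{
    $fetchedResults = [];
    $pageToken = null;

    do {
        $page = fetchPage($playlistId, $pageToken);
        foreach ($page['items'] as $video) {
            $fetchedResults[] = $video;
        }
        $pageToken = $page['nextPageToken'];
    } while ($pageToken !== null);

    return $fetchedResults;
}

print_r(getPlaylistVideos('PL-example'));
```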

It worked. But what if the playlist had 600 pages? The $fetchedResults array would become huge.

Also, what if the caller was only iterating until it found a certain result, and would then break out of the loop? It would've loaded everything into memory for nothing.

My initial reaction was to instead return one page at a time and handle paging outside of the method, but I'd rather the caller not know about pagination. It should just be able to say "give me all the videos".

A solution with generators (note the yield inside the paging loop):
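Here is a sketch of what that could look like, not the original code. fetchPage() is again a hypothetical stand-in for the HTTP request, serving fake pages from an array, and the $calls counter exists only so the example can show how many pages were actually requested.

```php
<?php
// Hypothetical stand-in for the real API call; $calls counts page requests.
function fetchPage($playlistId, $pageToken, &$calls)
{
    $calls++;
    $pages = [
        'start' => ['items' => ['video-1', 'video-2'], 'nextPageToken' => 'p2'],
        'p2'    => ['items' => ['video-3', 'video-4'], 'nextPageToken' => 'p3'],
        'p3'    => ['items' => ['video-5'],            'nextPageToken' => null],
    ];

    return $pages[$pageToken === null ? 'start' : $pageToken];
}

// Lazy version: the next page is only fetched once the current one is used up.
function getPlaylistVideos($playlistId, &$calls)
{
    $pageToken = null;

    do {
        $page = fetchPage($playlistId, $pageToken, $calls);
        foreach ($page['items'] as $video) {
            yield $video;   // hand over one video, then pause until the next is requested
        }
        $pageToken = $page['nextPageToken'];
    } while ($pageToken !== null);
}

// The caller just iterates; it can stop early and the remaining pages are never fetched.
$calls = 0;
$found = null;
foreach (getPlaylistVideos('PL-example', $calls) as $video) {
    if ($video === 'video-3') {
        $found = $video;
        break;
    }
}

echo "Found $found after $calls page request(s)\n";   // prints "Found video-3 after 2 page request(s)"
```

Breaking out at video-3 means the third page is never requested, which is exactly the wasted work the array-building version couldn't avoid.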

Any object using this method doesn't know anything about the implementation details of the YouTube API. It just knows it can request videos, and iterate over those videos.

As each page of videos is handled on-demand, memory usage is kept low.

Further reading on Generators: