Iterator

In Go, for range is the construct used to iterate over data structures. Several of its uses were introduced in previous chapters. Originally it worked only with a fixed set of built-in types:

  • Array
  • Slice
  • String
  • Map
  • Channel
  • Integer value

With only these built-in types, for range is inflexible: it cannot be extended and offers almost no support for custom types. Go 1.23 changed this by adding range over func, which makes custom iterators possible.

Introduction

Let's get to know iterators through an example. Do you remember the closure Fibonacci sequence example explained in the function section? Its implementation code is as follows:

go
func Fibonacci(n int) func() (int, bool) {
  a, b, c := 1, 1, 2
  i := 0
  return func() (int, bool) {
    if i >= n {
      return 0, false
    } else if i < 2 {
      f := i
      i++
      return f, true
    }

    a, b = b, c
    c = a + b
    i++

    return a, true
  }
}

We can transform it into an iterator, as shown below. Notice that the code is shorter:

go
func Fibonacci(n int) func(yield func(int) bool) {
  a, b, c := 0, 1, 1
  return func(yield func(int) bool) {
    for range n {
      if !yield(a) {
        return
      }
      a, b = b, c
      c = a + b
    }
  }
}

Go's iterators follow the range-over-func style, so we can use them directly with for range, which is more convenient than calling the closure by hand:

go
func main() {
  n := 8
  for f := range Fibonacci(n) {
    fmt.Println(f)
  }
}

Output:

0
1
1
2
3
5
8
13

As shown above, an iterator is a function (usually a closure) that accepts a callback function as its parameter. You can even see the word yield in it, which will look familiar to anyone who has written Python generators. However, Go's iterators add no new keywords or syntax. In the example above, yield is just an ordinary callback parameter, not a keyword; the name is only a convention that makes the code easier to understand.

Push Iterator

Regarding the definition of iterator, we can find the following explanation in the iter library:

An iterator is a function that passes successive elements of a sequence to a callback function, conventionally named yield.

One thing we can clarify from this is that an iterator is a function that accepts a callback function as a parameter. During iteration, it passes elements of the sequence one by one to the callback function yield. In the previous example, we used the iterator in the following way:

go
for f := range Fibonacci(n) {
    fmt.Println(f)
}

According to the official definition, the for range loop over the Fibonacci iterator above is equivalent to the following code:

go
Fibonacci(n)(func(f int) bool {
    fmt.Println(f)
    return true
})

The body of the loop is the iterator's callback function yield. When the function returns true, the iterator continues iterating; otherwise, it stops.

Additionally, the iter standard library defines the iterator types iter.Seq and iter.Seq2, both of which are function types:

go
type Seq[V any] func(yield func(V) bool)

type Seq2[K, V any] func(yield func(K, V) bool)

The callback of iter.Seq accepts one parameter, so for range produces one value per iteration:

go
for v := range iter {
  // body
}

The callback of iter.Seq2 accepts two parameters, so for range produces two values per iteration:

go
for k, v := range iter {
  // body
}

Although the standard library doesn't define a Seq type whose callback takes no parameters, such an iterator is also allowed. Its type is:

go
func(yield func() bool)

Usage is as follows:

go
for range iter {
  // body
}

The callback function can only have 0 to 2 parameters; more will not compile.

In short, the loop body of for range becomes the yield callback of the iterator, and the number of values for range produces equals the number of parameters yield accepts. On each iteration, the iterator calls yield, which runs the loop body, so the iterator actively passes the elements of the sequence to it. An iterator that actively passes elements like this is generally called a push iterator. A typical example is forEach in other languages, such as JavaScript:

javascript
let arr = [1, 2, 3, 4, 5];
arr
  .filter((e) => e % 2 === 0)
  .forEach((e) => {
    console.log(e);
  });

In Go, it's represented by range returning the iterated elements:

go
for index, value := range iterator() {
  fmt.Println(index, value)
}

In some languages (like Java), this style is known as stream processing.

Since the code in the loop body is passed to the iterator as a callback function, and is likely a closure, Go has to make that closure behave like an ordinary loop body when it executes statements such as defer, return, break, and goto. Consider the following situations.

For example, returning in an iterator loop - how should this return be handled in the yield callback function?

go
for index, value := range iterator() {
  if value > 10 {
    return
  }
  fmt.Println(index, value)
}

It's impossible to return directly in the callback function. Doing so would only stop the iteration without achieving the return effect:

go
iterator()(func(index int, value int) bool {
  if value > 10 {
    // Only stops the iteration; the enclosing function does not return.
    return false
  }
  fmt.Println(index, value)
  return true
})

Or using defer in an iterator loop:

go
for index, value := range iterator() {
    defer fmt.Println(index, value)
}

You also can't use defer directly in the callback function, because doing so would trigger the deferred call when the callback function ends:

go
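iterator()(func(index int, value int) bool {
  // Would run at the end of each callback call, not when the enclosing function returns.
  defer fmt.Println(index, value)
  return true
})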

The same applies to other keywords like break, continue, goto. Fortunately, Go has handled these situations for us, so we just need to use them. You don't need to worry about these for now. If interested, you can browse the source code in rangefunc/rewrite.go.

Pull Iterator

With a push iterator, the iterator controls the iteration logic and the user passively receives elements. With a pull iterator it is the other way around: the user controls the iteration logic and actively fetches elements from the sequence. Pull iterators generally expose functions such as next() and stop() to advance or end the iteration, and they can be implemented as a closure or a struct.

go
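scanner := bufio.NewScanner(file)
for scanner.Scan() {
    // Scan reported true, so Text holds the current line.
    fmt.Println(scanner.Text())
}
// Scan returns false at end of input or on error; Err tells the two apart.
if err := scanner.Err(); err != nil {
    fmt.Println(err)
    return
}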

As shown above, Scanner advances to the next line with the Scan() method, which also reports whether a line is available, and returns that line through the Text() method. This is also a form of pull iterator. Scanner uses a struct to record state, while pull iterators defined in the iter library use closures. We can convert a standard push iterator to a pull iterator with the iter.Pull or iter.Pull2 functions. The difference is that iter.Pull2 works on iter.Seq2, so its next function returns two values plus the validity flag. Their signatures are:

go
func Pull[V any](seq Seq[V]) (next func() (V, bool), stop func())

func Pull2[K, V any](seq Seq2[K, V]) (next func() (K, V, bool), stop func())

They both accept an iterator as a parameter and return two functions next() and stop() for controlling iteration continuation and stopping.

go
func next() (V, bool)

func stop()

next returns the next element and a boolean indicating whether that element is valid. When iteration ends, next returns the zero value of the element type and false. The stop function ends the iteration. If the caller stops using the iterator before next has returned false, it must call stop to end the iteration. Note that calling next or stop on the same iterator from multiple goroutines at the same time is an error, because they are not safe for concurrent use.

Let's demonstrate with an example that transforms the previous Fibonacci iterator into a pull iterator:

go
func main() {
  n := 10
  next, stop := iter.Pull(Fibonacci(n))
  defer stop()
  for {
    fibn, ok := next()
    if !ok {
      break
    }
    fmt.Println(fibn)
  }
}

Output:

0
1
1
2
3
5
8
13
21
34

This way we can control the iteration manually through the next and stop functions. You might think this is unnecessary: if you want manual control, why not just use the original closure version? The closure is used like this:

go
func main() {
  fib := Fibonacci(10)
  for {
    n, ok := fib()
    if !ok {
      break
    }
    fmt.Println(n)
  }
}

The transformation went closure → iterator → pull iterator. Closures and pull iterators are used in much the same way and share the same idea, and the latter may also carry extra runtime overhead from the conversion. For this example, then, the round trip is unnecessary. iter.Pull exists to serve iter.Seq, that is, to convert an existing push iterator into a pull iterator. If all you want is a pull iterator, writing a push iterator only to convert it adds complexity and costs performance; as the Fibonacci example shows, you go in a circle back to the starting point. The only benefit might be conforming to the official iterator conventions.

Error Handling

What if an error occurs during iteration? We can pass it to the yield function so that for range delivers it to the caller, who then handles it, as in this line iterator example:

go
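func ScanLines(reader io.Reader) iter.Seq2[string, error] {
  scanner := bufio.NewScanner(reader)
  return func(yield func(string, error) bool) {
    for scanner.Scan() {
      if !yield(scanner.Text(), nil) {
        return
      }
    }
    // Scan returns false on both end of input and failure, so the error
    // only becomes visible here; pass it to the caller as a final pair.
    if err := scanner.Err(); err != nil {
      yield("", err)
    }
  }
}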

TIP

Note that the ScanLines iterator is single-use. Once the reader has been consumed or the file is closed, it cannot be used again.

You can see its second return value is error type. Usage is as follows:

go
for line, err := range ScanLines(file) {
    if err != nil {
        fmt.Println(err)
        break
    }
    fmt.Println(line)
}

This handling is no different from normal error handling. The same applies to pull iterators:

go
next, stop := iter.Pull2(ScanLines(file))
defer stop()
for {
    line, err, ok := next()
    if err != nil {
        fmt.Println(err)
        break
    } else if !ok {
        break
    }
    fmt.Println(line)
}

If a panic occurs, just use recover as usual:

go
defer func() {
    if err := recover(); err != nil {
        fmt.Println("panic:", err)
        os.Exit(1)
    }
}()

for line, err := range ScanLines(file) {
    if err != nil {
        fmt.Println(err)
        break
    }
    fmt.Println(line)
}

The same applies to pull iterators, so I won't demonstrate here.

Standard Library

Many standard library packages also support iterators. The most commonly used are slices and maps. Here are some practical functions:

slices.All

go
func All[Slice ~[]E, E any](s Slice) iter.Seq2[int, E]

slices.All converts a slice into a slice iterator:

go
func main() {
  s := []int{1, 2, 3, 4, 5}
  for i, n := range slices.All(s) {
    fmt.Println(i, n)
  }
}

Output:

0 1
1 2
2 3
3 4
4 5

slices.Values

go
func Values[Slice ~[]E, E any](s Slice) iter.Seq[E]

slices.Values converts a slice into a slice iterator without index:

go
func main() {
  s := []int{1, 2, 3, 4, 5}
  for n := range slices.Values(s) {
    fmt.Println(n)
  }
}

Output:

1
2
3
4
5

slices.Chunk

go
func Chunk[Slice ~[]E, E any](s Slice, n int) iter.Seq[Slice]

slices.Chunk returns an iterator that pushes consecutive sub-slices of up to n elements to the caller; the last chunk may be shorter. It panics if n is less than 1:

go
func main() {
  s := []int{1, 2, 3, 4, 5}
  for chunk := range slices.Chunk(s, 2) {
    fmt.Println(chunk)
  }
}

Output:

[1 2]
[3 4]
[5]

slices.Collect

go
func Collect[E any](seq iter.Seq[E]) []E

slices.Collect function collects a slice iterator into a slice:

go
func main() {
  s := []int{1, 2, 3, 4, 5}
  s2 := slices.Collect(slices.Values(s))
  fmt.Println(s2)
}

Output:

[1 2 3 4 5]

maps.Keys

go
func Keys[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[K]

maps.Keys returns an iterator that iterates over all keys of a map. Combined with slices.Collect, it can be collected directly into a slice:

go
func main() {
  m := map[string]int{"one": 1, "two": 2, "three": 3}
  keys := slices.Collect(maps.Keys(m))
  fmt.Println(keys)
}

Output (map iteration order is not specified, so the order may differ between runs):

[three one two]