Implementing LINQ's First Method

Deep Dive into LINQ #2

Posted on 2022-03-04
target: .NET 6

Deep Dive into LINQ Series:

Part 1 - Using LINQ `First` Method

Part 2 - Implementing LINQ's First Method

Introduction

This article is the second in the Deep Dive into LINQ series. In the Deep Dive into LINQ series, we will examine the basics of LINQ's internal implementation and performance characteristics.

Continuing from the previous article, we will study LINQ's First method.

This time, we will actually create our own First method to get a rough idea of the process flow.

Create a First method

First method without condition

First, let's create the simplest, non-conditional First method.

public static TSource MyFirst<TSource>(this IEnumerable<TSource> source)

This is a method that works like this:

  • Returns the first element of the sequence
  • Raises InvalidOperationException if the sequence is empty

To enumerate the elements of IEnumerable<T>, use the GetEnumerator method to create an instance of type IEnumerator<T>.

public interface IEnumerator<T>.
{
    bool MoveNext();
    T Current { get; }
}

IEnumerator<T> is an interface for enumerating sequence elements one by one. IEnumerator<T> has a method MoveNext, which "moves to the next element". The current element can be obtained with the Current property. MoveNext returns true if there is a next element, false otherwise.

If we try to implement the First method using this interface, we get the following

public static TSource MyFirst<TSource>(this IEnumerable<TSource> source)
{
    ArgumentNullException.ThrowIfNull(source);

    using var enumerator = source.GetEnumerator();
    // MoveNext If possible (i.e., if the first element exists), return that element
    if (enumerator.MoveNext())
    {
        return enumerator.Current;
    }

    // Throw an exception if MoveNext fails (i.e., if the sequence is empty)
    throw new InvalidOperationException("Sequence is empty");

First, call enumerator.MoveNext(). If it returns true, it has moved to the first element of the sequence, and returns Current. If it returns false, it throws InvalidOperationException because the first element is missing, i.e. the sequence is empty.

Since the processing around MoveNext can be shared with the FirstOrDefault method, we will cut it out into a method called TryGetFirst.

public static TSource MyFirst<TSource>(this IEnumerable<TSource> source)
{
    return source.TryGetFirst(out var first)
        ? first!
        : throw new InvalidOperationException("Sequence is empty"); // throw an exception if no element
}

// Attempt to retrieve the first element and return the result in the out argument if available.
private static bool TryGetFirst<TSource>(this IEnumerable<TSource> source, out TSource? first)
{
    ArgumentNullException.ThrowIfNull(source);

    using var enumerator = source.GetEnumerator();
    // return element if found
    if (enumerator.MoveNext())
    {
        first = enumerator.Current;
        return true;
    }

    // false if no element
    first = default;
    return false;
}

FirstOrDefault method without condition

Let's implement the FirstOrDefault method in the same way. The implementation is almost the same as the First method, except that it throws an exception.

public static TSource? MyFirstOrDefault<TSource>(this IEnumerable<TSource> source) =>
    source.TryGetFirst(out var first) ? first : default;

// Methods that allow default values to be specified
public static TSource? MyFirstOrDefault<TSource>(this IEnumerable<TSource> source, TSource? defaultValue) =>
    TryGetFirst(out var first) ? first : defaultValue;

Conditional First and FirstOrDefault methods

Let's implement the conditional one in the same way. We will create the TryGetFirst method first for this as well.

// Attempts to retrieve the first element that satisfies the condition and returns the result with the out argument if it can be retrieved.
private static bool TryGetFirst<TSource>(this IEnumerable<TSource> source, Func <TSource, bool>predicate, out TSource? first)
{
    ArgumentNullException.ThrowIfNull(source);
    ArgumentNullException.ThrowIfNull(predicate);

    // return if an element is found that satisfies the condition
    foreach (var element in source)
    {
        if (predicate(element))
        {
            first = element;
            return true;
        }
    }

    // false if no element satisfies the condition
    first = default;
    return false;
}

The conditional First method was to return the first element matching the condition. Therefore, I inspect the elements in order until I find one that matches the condition.

Using this TryGetFirst method, the First and FirstOrDefault methods can be implemented as follows

public static TSource MyFirst<TSource>(this IEnumerable<TSource> source, Func <TSource, bool>predicate)
{
    return source.TryGetFirst(predicate, out var first)
        ? first!
        : throw new InvalidOperationException("Sequence is empty"); // exception if no element
}

public static TSource? MyFirstOrDefault<TSource>(this IEnumerable<TSource> source, Func <TSource, bool>predicate) =>
    source.TryGetFirst(predicate, out var first) ? first : default;

public static TSource? MyFirstOrDefault<TSource>(this IEnumerable<TSource> source, Func <TSource, bool>predicate, TSource? defaultValue) =>
    TryGetFirst(predicate, out var first) ? first : defaultValue;

Summary

In this article, we briefly implemented the First method. In the next article, we will look at the actual implementation of the First method in .NET 6 and explore how to optimize it.

Source code is available on GitHub.