delacry
Type-safe collections in PHP 8.4: what I wish arrays had

Index gaps and the pipe operator

You know the feeling. You're three callbacks deep in array_map, you tack on array_filter, then array_values to fix the keys, and PhpStorm gives up and types everything as array. PHPStan was useful five lines ago. Now it's just nodding politely.

I spent six months of weekends building a library to fix this. It's called noctud/collection, it's PHP 8.4+ only, and this post is about why.

The everyday pain

Here's a scene that probably looks familiar:

$activeAdmins = array_values(
    array_filter(
        array_map(fn($u) => $u->refresh(), $users),
        fn($u) => $u->isActive() && $u->isAdmin()
    )
);

Three problems packed into one expression.

Read order fights execution order. Your eyes hit array_values first, but it runs last. You parse the code in the opposite direction it executes, every time, forever.

Types collapse. In this simple case they survive: array_filter returns array<int, User>, and so does array_values. Try the same thing with a User|Customer union and a couple of generic helpers in the chain, and PHPStan starts shrugging.

The array_values is there to plug a hole in PHP itself. Filtering leaves index gaps. You forget the call once and your JSON output suddenly serializes as an object instead of an array because your indices went 0, 2, 5.
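
To make the index-gap problem concrete, here is a minimal sketch in plain PHP. The `$scores` data is made up for illustration, picked so that the surviving indices are exactly 0, 2, 5:

```php
<?php
// Hypothetical sample data, chosen so the surviving indices are 0, 2, 5.
$scores = [90, 10, 85, 20, 30, 70];

// array_filter() keeps the original keys of the elements that pass.
$passing = array_filter($scores, fn(int $s): bool => $s >= 70);

// The keys are no longer sequential, so json_encode() emits a JSON object.
$withGaps = json_encode($passing);

// array_values() reindexes from 0, which restores a JSON array.
$reindexed = json_encode(array_values($passing));

echo $withGaps, PHP_EOL;  // {"0":90,"2":85,"5":70}
echo $reindexed, PHP_EOL; // [90,85,70]
```

Nothing warns you about the first form; the shape of the payload simply changes depending on which elements happened to pass the filter.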

And then there are keys. PHP arrays don't really have keys, they have a sad approximation of them:

$a = ['1' => 'a'];
var_dump(array_keys($a)); // [0 => int(1)] - your "1" is now an int

$b = [true => 'x', 1 => 'y'];
count($b); // 1 - true and 1 collide

$c = [];
$c[$someUser] = 'admin'; // Fatal error: Illegal offset type

You can't type-annotate these problems away. array<string, User> is a comfortable lie. PHP will happily put int keys in there and PHPStan can only believe whatever you wrote in the docblock.

What I built instead

Three real types: List, Set, Map. Each one in mutable, immutable, and lazy flavors. Full generics that flow through every method. Implementations are hidden behind interfaces, so swapping internals later is free.

Here's the same code, rewritten:

$activeAdmins = listOf($users)
    ->map(fn(User $u) => $u->refresh())
    ->filter(fn(User $u) => $u->isActive() && $u->isAdmin());

That's the whole thing. Top to bottom in reading order, no plumbing call to fix indices, and PHPStan keeps ImmutableList<User> all the way through. If map() had narrowed the element type, you'd get that propagated too.

Maps stop lying about keys.

Objects work, no Fatal error:

$user = new User('Jesse');
$roles = mutableMapOf();
$roles[$user] = 'admin';
$roles[$user]; // 'admin'

Default object hashing uses spl_object_id. Implement Hashable on your own classes when you want value-based identity instead of reference identity, which is what you usually want for value objects.
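
The difference between reference identity and value identity can be sketched in plain PHP. This does not use the library's Hashable interface (its method signature isn't shown in this post); the `Money` class and its `valueKey()` method are hypothetical stand-ins:

```php
<?php
// Hypothetical value object, used only for this illustration.
final class Money
{
    public function __construct(
        public readonly int $amount,
        public readonly string $currency,
    ) {}

    // Hypothetical helper: a key derived from the values, not the instance.
    public function valueKey(): string
    {
        return $this->currency . ':' . $this->amount;
    }
}

$a = new Money(500, 'EUR');
$b = new Money(500, 'EUR');

// Reference identity: two live objects never share an object id.
$sameReference = spl_object_id($a) === spl_object_id($b); // false

// Value identity: equal contents produce the same key.
$sameValue = $a->valueKey() === $b->valueKey(); // true
```

With id-based hashing, `$a` and `$b` would be two separate map keys. With a value-derived key they would land on the same entry, which is usually the behavior you expect from value objects.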

For mixed key types, you can't go through an array literal at all (PHP casts at the literal level, before any function sees it). mapOfPairs sidesteps that by taking pairs:

$flags = mapOfPairs([
    [true, 'enabled'],
    [1, 'one'],
    ['1', 'string-one'],
]);
$flags->count(); // 3 - PHP arrays would have collapsed all three

When you specifically want a Map<string, V> and you're getting data from somewhere PHP has already mangled (a request, a DB row, a JSON decode), stringMapOf() is the recovery path:

$config = stringMapOf(['1' => 'enabled']);
$config->keys; // ImmutableSet {'1'} - cast back to string at construction

There's a matching intMapOf() that goes the other direction and rejects anything that isn't an int. The factories enforce the key type at construction, so the analyzer and the runtime end up agreeing on array<string, V> (or array<int, V>) without you having to lie in a docblock.

Things you probably don't get from other libs

Mutable and immutable are separate types, not flags.
MutableList<T>::add() returns $this. ImmutableList<T>::add() returns a new instance and is annotated with #[NoDiscard], which becomes a real warning in PHP 8.5. No more silently throwing your "added" element into the void.
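
As a rough sketch of that split, here are two toy classes in plain PHP. They are hypothetical and much smaller than the library's real types; the only point is the different return contract of `add()`:

```php
<?php
// Hypothetical minimal classes; not the library's implementation.
final class TinyMutableList
{
    /** @var list<string> */
    private array $items = [];

    // Mutates in place and returns the same instance for chaining.
    public function add(string $item): self
    {
        $this->items[] = $item;
        return $this;
    }

    public function count(): int
    {
        return count($this->items);
    }
}

final readonly class TinyImmutableList
{
    /** @param list<string> $items */
    public function __construct(private array $items = []) {}

    // Returns a new instance; ignoring the result loses the element.
    // The attribute is enforced from PHP 8.5 and ignored on 8.4.
    #[\NoDiscard]
    public function add(string $item): self
    {
        return new self([...$this->items, $item]);
    }

    public function count(): int
    {
        return count($this->items);
    }
}

$mutable = new TinyMutableList();
$sameInstance = $mutable->add('php') === $mutable; // true

$empty = new TinyImmutableList();
$withPhp = $empty->add('php'); // $empty itself is unchanged
```

Because the two behaviors live on separate types, a parameter typed as the immutable variant tells the caller (and the analyzer) that the callee cannot change the list it was given.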

Map views are live collections.
$map->keys, $map->values, and $map->entries are real Set and List instances backed by the same underlying store. They share memory and they have the full collection API. So $map->values->sum() and $map->keys->sorted() just work, no copying.

Change tracking, only when you actually want it.

$tags = mutableSetOf(['php', 'kotlin']);
$result = $tags->tracked()->add('php');
$result->changed; // false - 'php' was already in the set

I wrote this for cache invalidation logic and got tired of writing the "did this actually do anything" check by hand.

Lazy initialization via PHP 8.4 lazy objects.
Pass a closure to any factory and the data is materialized only when first accessed. Copy-on-write between mutable and immutable variants is virtually free for the common case where you don't mutate after converting.
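
For readers who haven't used PHP 8.4 lazy objects yet, here is a minimal sketch of the underlying mechanism using `ReflectionClass::newLazyGhost()`. The `Report` class is hypothetical, and this is not necessarily how the library wires it up internally:

```php
<?php
// Hypothetical class standing in for an expensive-to-build collection.
class Report
{
    /** @param list<int> $rows */
    public function __construct(public array $rows) {}
}

$loads = 0;
$reflector = new ReflectionClass(Report::class);

// newLazyGhost() returns a Report whose initializer has not run yet.
$report = $reflector->newLazyGhost(function (Report $ghost) use (&$loads): void {
    $loads++;
    $ghost->__construct([1, 2, 3]); // stand-in for the expensive load
});

$lazyBefore = $reflector->isUninitializedLazyObject($report); // true

// The first property read triggers the initializer.
$rowCount = count($report->rows);

$lazyAfter = $reflector->isUninitializedLazyObject($report); // false
```

The object can be passed around freely while it is still uninitialized; the closure runs once, on first access, and never if nobody touches the data.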

There's also a small PhpStorm plugin that fixes a couple of generic-inference quirks the IDE has with callbacks and __invoke. I reported a few of the bugs upstream while I was at it.

Where the design comes from

If the API feels familiar after using Kotlin or modern C#, that's intentional. Kotlin got the mutable/immutable split right, the read-only interfaces right, and the chained pipeline ergonomics right. Java laid down the foundational List/Set/Map vocabulary decades earlier. I borrowed from both; the FAQ walks through the specific differences if you want them.

PHPStan level 9 and Psalm strict, both clean. The generics carry through into your code, so your call chains stay typed end to end with no mixed returns to narrow.

Try it

composer require noctud/collection

The default pin (^0.1.1) resolves to >=0.1.1 <0.2.0, so it keeps you on 0.1.x patches only. BC breaks ship as 0.2 and Composer won't auto-install them, so locking this way is safe through the 0.x cycle.

Docs and examples: https://noctud.dev
GitHub: https://github.com/noctud/collection

I'm planning a few 0.x releases through 2026 before locking the API for 1.0. Big remaining work is a Sequence type for lazy intermediate operations (similar to Kotlin sequences or Java streams), and a tests refactor.

If you try it on something real and the API gets in your way, I want to hear about it. Right now is when feedback actually shapes 1.0.

Top comments (8)

david duymelinck

To me it looks like you are trying to solve problems that are already fixed.

The nested function calls are solved by the pipe operator in PHP 8.5.
The functions you are using with the pipe operator tend to be pure functions, so the immutability of start data doesn't matter. It is just input.

Generics can be emulated by creating an array of objects.
For immutability, use readonly on the property or the class.

If you want a set you can use the SplObjectStorage class. Not the best naming, but it does the trick.

The biggest use case for lazy collections is lots of data. For that you can use generators.

It doesn't come in a nice package like your library, but the fact that all the features are standalone makes it possible to bend them into all kinds of shapes, whereas your library relies on a fixed set of shapes.

delacry • Edited

Thanks for the detailed read, but I see most of this differently.

Pipe operator helps with nested calls, but it only takes single-param callables, so most array functions need wrapping. And it doesn't fix that half of PHP's array functions return a result while the other half mutate in place (sort mutates and returns bool, array_filter returns but leaves index gaps). Types still collapse through a piped chain too.
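
A minimal sketch of the sort() inconsistency in plain PHP. The `sorted()` helper is hypothetical, just the kind of wrapper you end up writing to make sort() usable in a chain:

```php
<?php
$numbers = [3, 1, 2];

// sort() reorders $numbers in place and returns true, not the sorted array.
$sortResult = sort($numbers);

// Hypothetical pure wrapper: copy in, sorted copy out, safe to chain.
function sorted(array $items): array
{
    sort($items); // $items is a by-value copy; the caller's array is untouched
    return $items;
}

$original = [3, 1, 2];
$copy = sorted($original);
```

After this runs, `$sortResult` is `true`, `$numbers` and `$copy` are both `[1, 2, 3]`, and `$original` is still `[3, 1, 2]`.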

array<string, User> is a docblock lie. Put '1' => $user in there and the key is int. PHPStan trusts what you wrote, PHP doesn't enforce it. The library makes the runtime and the static type actually agree.

readonly is a property modifier, not a collection type. There's no map/filter/reduce, no way to take an immutable collection as a parameter or return one from a function. To "add" to a readonly array property you rebuild the whole enclosing object by hand. ImmutableList<User> is a value type with a real API, and the type itself signals the contract to the caller and to PHPStan.

SplObjectStorage only takes objects, so no Set<string> or Set<int>. No map/filter/reduce/intersect/union. The API is from PHP 5.1 and it shows (attach/detach instead of add/remove). It works, it's just not what people mean today when they say "Set".

Small correction: the library doesn't have lazy collections yet, just lazy initialization via 8.4 lazy objects. Different thing. A Sequence type for lazy ops is on the roadmap.

On standalone primitives vs a coherent library: yes you can bend primitives into any shape, but everyone bends them differently in every codebase and the IDE can't help. Same argument applies to "just use PDO instead of Doctrine" or "just use the SPL queue instead of a job library". Sometimes the value of a library is the shared vocabulary and the consistency, not capabilities you couldn't technically build yourself.

david duymelinck • Edited

Yes, I read it too fast. It is lazy initialization.
And you are right about SplObjectStorage: it is clunky, but with an extended class you can make it developer friendly.

I fully agree the docblocks are an incomplete solution. They don't help at runtime.
What I was suggesting is:

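readonly class UserItem implements Item
{
    // Promoted properties, so the constructor arguments are actually stored.
    public function __construct(public string $id, public User $user) {}
}

readonly class UserCollection implements Collection
{
    // No default value here: readonly properties cannot declare one.
    private array $items;

    public function __construct(UserItem ...$items)
    {
        $this->items = $items;
    }
}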

This way the collections are more specific than the collections in your library.

> sort mutates and returns bool, array_filter returns but leaves index gaps

To solve it you can create pure functions.

namespace App\Collections;

function sortKeys(Collection $collection): Collection
{
    $collectionType = $collection::class;
    $temp = $collection->toArray();

    // usort() sorts $temp in place; the input collection is left untouched.
    usort($temp, fn(Item $a, Item $b) => $a->getKey() <=> $b->getKey());

    return new $collectionType(...$temp);
}

> Pipe operator helps with nested calls, but it only takes single-param callables, so most array functions need wrapping

For now. In PHP 8.6 the wrapping will become minimal.
And with the function above the wrapping isn't needed.

> Sometimes the value of a library is the shared vocabulary and the consistency, not capabilities you couldn't technically build yourself.

I agree with you that libraries can be useful. But with the examples above I showed it doesn't take a lot of effort to create a collections pattern of your own without the need of a dependency.

I did see you put a lot of effort into the library, and I applaud you. I'm just the weirdo that tries to keep as close to PHP as possible.

delacry

Appreciate the follow-up, and fair enough on staying close to PHP. That's a legit philosophy and a lot of good code that performs well gets written that way.

Small thing on the sortKeys example: with new $collectionType(...$temp) PHPStan loses the element type across the call (Collection in, Collection out). Solvable with templates and annotations, but it's the plumbing a generic collection library handles once.

And yeah, PFA v2 looks like it's passing, so 8.6 should cut down the wrapping. Personally not sure pipe + PFA is gonna read well in real codebases though. PHP's array stdlib wasn't designed for composition, and bolting it on with operators feels like it'll trade one kind of readability problem for another. We'll see how it shakes out once people use it on real code.

Either way, thanks for the back-and-forth.

david duymelinck

> PHPStan loses the element type across the call (Collection in, Collection out).

The whole idea is to just not care about PHPStan for collections.
While it is a good tool, it looks to me like a lot of developers put too much trust in it to solve their problems.

> PHP's array stdlib wasn't designed for composition, and bolting it on with operators feels like it'll trade one kind of readability problem for another.

In what way do you think it can create a readability problem?
I do agree that the mix of by value and by reference functions is going to trip people up when using them in a function chain.

delacry

> In what way do you think it can create a readability problem?

Quick example showing what I mean. Take "active admins, refreshed, sorted by name, top 10".

Pipe + PFA:

$topAdmins = $users
    |> array_filter(?, fn(User $u) => $u->isActive())
    |> array_map(fn(User $u) => $u->refresh(), ?)
    |> array_filter(?, fn(User $u) => $u->isAdmin())
    |> (fn($us) => (usort($us, fn(User $a, User $b) => $a->name <=> $b->name) ? $us : $us))
    |> array_slice(?, 0, 10);

Method chaining:

$topAdmins = listOf($users)
    ->filter(fn(User $u) => $u->isActive())
    ->map(fn(User $u) => $u->refresh())
    ->filter(fn(User $u) => $u->isAdmin())
    ->sortedBy(fn(User $u) => $u->name)
    ->takeFirst(10);

Placeholder flips position between array_filter and array_map because they have opposite argument order. usort mutates and returns bool so it needs the ternary wrapper. Method chaining sidesteps both - every step is a method on the same value.

That's the readability gap I meant. Bowing out here, good thread.

david duymelinck

I see how it could be confusing. For me the solution is a library with more consistent array functions rather than a collection library.

The pipe operator opened up a whole new way of thinking about data manipulation. With the builder pattern you are tied down to a single class, whereas in a function chain the type can change depending on the actions that need to happen.
The builder pattern isn't bad, but it is overused.

Laura Ashaley

Nice direction. Type-safe collections in PHP improve reliability by reducing runtime surprises. It's basically bringing more structure to what was historically a very loose array system.