
Serialization and deserialization without blowing React

Let’s say you have a string source of data that you need to read from and write to. There isn’t much you can do with a raw string, so you’d deserialize it with JSON.parse(…), and when you need to serialize it again, you’d use JSON.stringify(…).

It might not sound like much, but when you’re dealing with complex objects in a system built on immutable data, it can lead to unintended re-render cycles, and can even cause full page re-renders for data that didn’t actually change. This was a major performance issue we had to fix where I work.

This isn’t an esoteric issue: popular browser APIs such as history and localStorage force you to use strings, while we usually work with objects.
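For example, storing an object in localStorage always goes through a string round trip. A minimal sketch (the settings object is just an illustration):

const settings = { theme: "dark", count: 1 };

// `localStorage` only stores strings, so we serialize on write…
localStorage.setItem("settings", JSON.stringify(settings));

// …and deserialize on read; every read produces a brand-new object instance
const restored = JSON.parse(localStorage.getItem("settings"));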

Before we reached a solution we were happy with, we went through a few phases. I think most developers would follow a similar path.

The naive solution

When approaching this problem, the naive solution looks something like this:

function useData() {
	// Reading from the source the data "as is" (string)
	const rawData = useSyncExternalStore(source.subscribe, () => source.value);

	// We parse data in `useMemo`, because `useSyncExternalStore` will call
	// `() => source.value` each render, and we want to make sure it doesn't
	// return a fresh instance each render
	const data = useMemo(() => JSON.parse(rawData), [rawData]);

	const setData = useCallback((data) => {
		// We need to convert `object` to `string` when saving to the source
		source.setValue(JSON.stringify(data));
	}, []);

	return [data, setData];
}

And the usage of this hook might look like:

const [data, setData] = useData();

return (
	<button
		onClick={() => {
			setData({ ...data, count: data.count + 1 });
		}}
	>
		{data.count}
	</button>
);

This solution might be good enough for many cases, but it has some limitations.

For example, if you have 100 components that each depend on a different part of the data, the ergonomics start to suffer. So you might change the implementation to something like this:

// Now we ask for a `key` to create a more ergonomic hook
function useData(key) {
	const rawData = useSyncExternalStore(source.subscribe, () => source.value);

	const value = useMemo(() => JSON.parse(rawData)[key], [rawData, key]);

	const setValue = useCallback(
		(value) => {
			source.setValue(
				JSON.stringify({
					...JSON.parse(source.value),
					[key]: value,
				}),
			);
		},
		[key],
	);

	return [value, setValue];
}

And the usage becomes simpler:

// We are explicit about which part we care about
const [count, setCount] = useData("count");

return (
	<button
		onClick={() => {
			setCount(count + 1);
		}}
	>
		{count}
	</button>
);

This solution might be more ergonomic, but it still has some issues. If you have 100 components that are connected to the source:

  1. it means you will implicitly call JSON.parse(…) 100 times, which is expensive!
  2. it also means you hold 100 instances of identical data in memory

Single source for all

The solution to this issue is moving to a single store for all the deserialized state. This way we deserialize only once, and keep a single instance for all usages. To do this, we need to wrap the source. For the sake of simplicity, I didn’t build an actual wrapper, but added a variable and an emitter that replace the “source” inside the hook. So now, our implementation looks as follows:

const emitter = new Emitter();

// This is going to be our single source of truth for all hook usages
let currentValue = JSON.parse(source.value);

source.subscribe(() => {
	// We deserialize only once for all subscribers
	currentValue = JSON.parse(source.value);
	// We notify the hooks that there was a change
	emitter.emit();
});

function useData(key) {
	const value = useSyncExternalStore(
		// We subscribe to the emitter instead of the source
		emitter.subscribe,
		() => currentValue[key],
	);

	const setValue = useCallback(
		(value) => {
			source.setValue(
				JSON.stringify({
					...currentValue,
					[key]: value,
				}),
			);
		},
		[key],
	);

	return [value, setValue];
}
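
The Emitter here is just a small pub/sub helper that the post leaves abstract. A minimal sketch of what it might look like (an assumption, not the original implementation); note that subscribe returns an unsubscribe function, which is what useSyncExternalStore expects:

class Emitter {
	listeners = new Set();

	// Defined as arrow-function fields so they can be passed around directly,
	// e.g. `useSyncExternalStore(emitter.subscribe, …)`
	subscribe = (listener) => {
		this.listeners.add(listener);
		return () => this.listeners.delete(listener);
	};

	emit = () => {
		for (const listener of this.listeners) {
			listener();
		}
	};
}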

We have some improvement, but now the issue is what happens when the values of the object aren’t primitives, but reference types such as arrays and objects.

If we have a structure such as this:

{ "a": [], "b": [] }

It means that every time we get an update from the source and run JSON.parse(…), we get a whole new object. So if we update a, all the places subscribed to b will re-render too, because b is a new reference even though its data stayed the same. Sometimes this can even cause a full page re-render.
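
To make this concrete, two parses of the very same string never share references:

const raw = '{ "a": [], "b": [] }';

// Each call to JSON.parse(…) builds brand-new arrays and objects
JSON.parse(raw).b === JSON.parse(raw).b; // false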

Using serialization in write-only mode

To solve this, the naive solution is to just ignore updates and move to a write-only mode. We read from the source once on load, but from there on, we only write and never read again.

So now, after page load, currentValue will only change during setValue, where we update it ourselves. It means that if b changes, a will keep the same reference.

const emitter = new Emitter();

// The only time we'll read from `source.value`
let currentValue = JSON.parse(source.value);

// We removed the source subscription, as we're now in write-only mode

function useData(key) {
	const value = useSyncExternalStore(
		emitter.subscribe,
		() => currentValue[key],
	);

	const setValue = useCallback(
		(value) => {
			// We replace `currentValue` with a new object so React will see a new
			// snapshot, while the other properties keep their references and aren't
			// affected by the update
			currentValue = {
				...currentValue,
				[key]: value,
			};

			// We make sure the source is updated
			source.setValue(JSON.stringify(currentValue));

			// We notify all other hooks for the change
			emitter.emit();
		},
		[key],
	);

	return [value, setValue];
}

We solved one issue, but created a bigger one: the source might be updated externally, and we have to choose between ignoring those updates, or applying them and losing the optimization. But there is a good solution…

Structural sharing

What if we could create a new object in which all the unchanged parts keep their old references? In Solid there is a nice utility called reconcile that does this for mutable data. Sadly, React doesn’t have something similar for immutable data, but TanStack Query does exactly this inside its code. It’s called replaceEqualDeep, and I recommend you take a look at its code.

The way replaceEqualDeep works is by recursively traversing both the old and the new object, going from the leaves up to the root, and checking whether the data has changed. If a part is structurally equal, it returns the old data; if it isn’t, it merges the old and the new data into a new object. This way it achieves structural sharing, and old, unchanged data is preserved as-is.
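
To illustrate the idea, here is a simplified sketch of such a structural-sharing merge (this is not TanStack Query’s actual code, just the concept):

// Return `next`, but let every branch that is deeply equal to the matching
// branch in `prev` keep its old reference
function replaceEqualDeepSketch(prev, next) {
	if (prev === next) {
		return prev;
	}

	const bothArrays = Array.isArray(prev) && Array.isArray(next);
	const isPlainObject = (value) =>
		typeof value === "object" && value !== null && !Array.isArray(value);
	const bothObjects = isPlainObject(prev) && isPlainObject(next);

	if (!bothArrays && !bothObjects) {
		// Primitives, or a type change: just take the new value
		return next;
	}

	const prevKeys = bothArrays ? prev : Object.keys(prev);
	const nextKeys = bothArrays ? next : Object.keys(next);
	const result = bothArrays ? [] : {};
	let reusedCount = 0;

	for (let i = 0; i < nextKeys.length; i++) {
		const key = bothArrays ? i : nextKeys[i];
		result[key] = replaceEqualDeepSketch(prev[key], next[key]);
		if (result[key] === prev[key]) {
			reusedCount++;
		}
	}

	// If every child was reused and no keys were added or removed, the whole
	// branch is structurally equal, so we keep the old reference
	return prevKeys.length === nextKeys.length && reusedCount === nextKeys.length
		? prev
		: result;
}

Note that this only needs to handle JSON-like data (plain objects, arrays, and primitives), which is exactly what comes out of JSON.parse(…).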

In practice, places that weren’t changed won’t re-render: from React’s perspective nothing has changed, so no re-render is needed.

The complete solution is:

const emitter = new Emitter();

let currentValue = JSON.parse(source.value);

source.subscribe(() => {
	// We replace only the parts that were changed with new data
	currentValue = replaceEqualDeep(currentValue, JSON.parse(source.value));
	emitter.emit();
});

function useData(key) {
	const value = useSyncExternalStore(
		emitter.subscribe,
		() => currentValue[key],
	);

	const setValue = useCallback(
		(value) => {
			source.setValue(
				JSON.stringify({
					...currentValue,
					[key]: value,
				}),
			);
		},
		[key],
	);

	return [value, setValue];
}

Conclusion

With structural sharing we’ve fixed our performance issue, and gained the freedom to update the source from outside our own wrapper. We also get extra benefits: we can now have selectors, and we aren’t limited to a top-level property.
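
For instance, a selector-based read hook might look like this (a sketch on top of the final solution above; useDataSelector and the posts data shape are hypothetical). The selector should pick existing parts of the data rather than build new objects, so useSyncExternalStore gets a stable snapshot between emits:

function useDataSelector(selector) {
	return useSyncExternalStore(
		emitter.subscribe,
		// Thanks to structural sharing, `selector(currentValue)` returns the same
		// reference between emits as long as the selected branch didn't change
		() => selector(currentValue),
	);
}

// Usage: subscribe to a nested value instead of a top-level property
const firstPostTags = useDataSelector((data) => data.posts[0].tags);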

I’ve ignored the issues related to useSyncExternalStore and its lack of support for concurrent rendering, as that’s not what this post is about.

I hope you can see the power of structural sharing, but it should be noted that there is more that can be done, which I didn’t cover in this post. For example, separating the “get” and “set” hooks: if we only need to “set”, we don’t need to subscribe to changes we don’t care about (see the sketch below). In React, one of the keys to better performance is making sure React has less work to do; not subscribing to unnecessary changes and structural sharing are both ways of getting there. With these techniques we managed to boost performance and cut up to 50% of the rendering time on some pages.
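
As a rough sketch of that last idea (built on top of the final solution above; useSetData is a hypothetical name), a write-only hook simply skips the subscription:

// Write-only: no `useSyncExternalStore`, so a component using this hook never
// re-renders when the data changes; it only pushes updates to the source
function useSetData(key) {
	return useCallback(
		(value) => {
			source.setValue(
				JSON.stringify({
					...currentValue,
					[key]: value,
				}),
			);
		},
		[key],
	);
}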