Serialization and deserialization without blowing up React
Let’s say you have a string source of data, and you need to read from and write to it. You can’t do much with a raw string, so you’d deserialize it with JSON.parse(…), and when you need to serialize it again, you’d use JSON.stringify(…).
It might not sound like much, but when you’re dealing with complex objects in a system built on immutable data, it can lead to unintended re-render cycles, and even full-page re-renders for data that didn’t actually change. This was a major performance issue we had to fix where I work.
This isn’t an esoteric issue: popular browser APIs such as history and localStorage force you to work with strings, while in application code we usually work with objects.
Before we reached a solution we were happy with, we went through a few phases, and I think most developers would follow a similar path.
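All the examples below assume a small source wrapper with a string value, a setValue, and a subscribe method. Its exact implementation doesn’t matter; here’s a minimal sketch, where the localStorage backing and the "app-data" key are assumptions purely for illustration:

const listeners = new Set();

// A minimal sketch of the `source` used throughout this post.
// The localStorage backing and the "app-data" key are illustrative only.
const source = {
  // The raw serialized string
  get value() {
    return localStorage.getItem("app-data") ?? "{}";
  },
  // Write the raw string and notify subscribers
  setValue(next) {
    localStorage.setItem("app-data", next);
    listeners.forEach((listener) => listener());
  },
  // Subscribe to changes and return an unsubscribe function,
  // which is the shape `useSyncExternalStore` expects
  subscribe(listener) {
    listeners.add(listener);
    return () => listeners.delete(listener);
  },
};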
The naive solution
When approaching this problem, the naive solution looks something like this:
function useData() {
  // Read the raw data from the source "as is" (a string)
  const rawData = useSyncExternalStore(source.subscribe, () => source.value);
  // We parse the data in `useMemo` because `useSyncExternalStore` calls
  // `() => source.value` on every render, and we want to make sure parsing
  // doesn't produce a fresh instance each render
  const data = useMemo(() => JSON.parse(rawData), [rawData]);
  const setData = useCallback((data) => {
    // We need to convert the `object` back to a `string` when saving to the source
    source.setValue(JSON.stringify(data));
  }, []);
  return [data, setData];
}
And the usage of this hook might look like:
const [data, setData] = useData();
return (
  <button
    onClick={() => {
      setData({ ...data, count: data.count + 1 });
    }}
  >
    {data.count}
  </button>
);
This solution might be good enough for many cases, but it has some limitations.
For example, if you have 100 components that depend on different parts of the data, the ergonomics suffer, since every component receives the whole object. So you might change the implementation to something like this:
// Now we ask for a `key` to make the hook more ergonomic
function useData(key) {
  const rawData = useSyncExternalStore(source.subscribe, () => source.value);
  const value = useMemo(() => JSON.parse(rawData)[key], [rawData, key]);
  const setValue = useCallback(
    (value) => {
      // Read-modify-write: update only the given key in the serialized data
      source.setValue(
        JSON.stringify({
          ...JSON.parse(source.value),
          [key]: value,
        }),
      );
    },
    [key],
  );
  return [value, setValue];
}
And the usage becomes simpler:
// We are explicit about which part we care about
const [count, setCount] = useData("count");
return (
  <button
    onClick={() => {
      setCount(count + 1);
    }}
  >
    {count}
  </button>
);
This solution might be more ergonomic, but it still has some issues. If you have 100 components connected to the source:
- it means you call JSON.parse(…) implicitly 100 times, which is expensive!
- it also means you hold 100 instances of identical data in memory
Single source for all
The solution to this is moving to a single store for all the deserialized state, so we deserialize only once and keep a single instance for all usages. To make this work, we need to wrap the source. For the sake of simplicity, I didn’t write an actual wrapper, but added a variable and an emitter that replace the “source” inside the hook. Our implementation now looks as follows:
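The Emitter here is just a tiny pub/sub helper; any emitter with a subscribe and an emit will do. A minimal sketch of what I mean:

class Emitter {
  listeners = new Set();
  // Arrow-function fields so the methods can be passed around unbound,
  // e.g. `emitter.subscribe` straight into `useSyncExternalStore`
  subscribe = (listener) => {
    this.listeners.add(listener);
    return () => this.listeners.delete(listener);
  };
  emit = () => {
    this.listeners.forEach((listener) => listener());
  };
}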
const emitter = new Emitter();
// This is going to be our single source of truth for all hook usages
let currentValue = JSON.parse(source.value);
source.subscribe(() => {
  // We deserialize only once for all subscriptions
  currentValue = JSON.parse(source.value);
  // We notify the hooks that there was a change
  emitter.emit();
});

function useData(key) {
  const value = useSyncExternalStore(
    // We subscribe to the emitter instead of the source
    emitter.subscribe,
    () => currentValue[key],
  );
  const setValue = useCallback(
    (value) => {
      source.setValue(
        JSON.stringify({
          ...currentValue,
          [key]: value,
        }),
      );
    },
    [key],
  );
  return [value, setValue];
}
We have some improvement, but now the issue is what happens when the object’s values aren’t primitives, i.e. values held by reference, such as array and object.
If we have a structure such as this:
{ "a": [], "b": [] }
It means that every time we get an update from the source and call JSON.parse(…), we get a whole new object. So even if we only updated a, every place subscribed to b will re-render, because b is a new reference even though its data stayed the same. This can sometimes cause even a full-page re-render.
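To make the reference problem concrete:

const first = JSON.parse('{ "a": [], "b": [] }');
const second = JSON.parse('{ "a": [], "b": [] }');

// The data is structurally identical, but every reference is new
console.log(first.b === second.b); // false

// So a hook that snapshots `currentValue.b` gets a "new" value after every
// parse, and React re-renders all of `b`'s subscribers for nothing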
Using serialization in write-only mode
To solve this, the naive approach is to simply ignore updates and move to a write-only mode: we read from the source on page load, but from then on we only write and never read again.
So now, after page load, currentValue will only change during setValue, where it is modified directly. This means that if b changes, a keeps the same reference.
const emitter = new Emitter();
// The only time we'll read from `source.value`
let currentValue = JSON.parse(source.value);
// We removed the source subscription, as we've gone write-only

function useData(key) {
  const value = useSyncExternalStore(
    emitter.subscribe,
    () => currentValue[key],
  );
  const setValue = useCallback(
    (value) => {
      // We replace `currentValue` so that React gets a new snapshot,
      // while making sure other properties keep their references
      currentValue = {
        ...currentValue,
        [key]: value,
      };
      // We make sure the source is updated
      source.setValue(JSON.stringify(currentValue));
      // We notify all other hooks of the change
      emitter.emit();
    },
    [key],
  );
  return [value, setValue];
}
We solved one issue, but created a bigger one: the source might be updated externally, and now we have to choose between ignoring those updates or processing them and losing the optimization. Fortunately, there is a better solution…
Structural sharing
What if we could create a new object in which all the unchanged parts keep their old references? In Solid there’s a nice utility called reconcile that works with mutable data. Sadly, React doesn’t have anything similar for immutable data, but TanStack Query did exactly this inside their code. It’s called replaceEqualDeep, and I recommend taking a look at its code.
The way replaceEqualDeep works is by recursively traversing both the old and the new objects, from the leaves to the root, checking whether the data changed. If a part is structurally equal, it returns the old data; if not, it merges old and new data into a new object. This achieves structural sharing: old, unchanged data is preserved as-is.
This means components that depend on unchanged parts won’t re-render: from React’s perspective, nothing has changed there, so no re-render is needed.
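Based on that description, you’d get behavior like this:

const prev = { a: [1, 2], b: { c: 3 } };
const next = replaceEqualDeep(prev, { a: [1, 2], b: { c: 4 } });

console.log(next.a === prev.a); // true: structurally equal, old reference kept
console.log(next.b === prev.b); // false: this part actually changed
console.log(next === prev);     // false: the root changed because `b` changed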
The complete solution is:
const emitter = new Emitter();
let currentValue = JSON.parse(source.value);
source.subscribe(() => {
  // We replace only the parts that were changed with new data
  currentValue = replaceEqualDeep(currentValue, JSON.parse(source.value));
  emitter.emit();
});

function useData(key) {
  const value = useSyncExternalStore(
    emitter.subscribe,
    () => currentValue[key],
  );
  const setValue = useCallback(
    (value) => {
      source.setValue(
        JSON.stringify({
          ...currentValue,
          [key]: value,
        }),
      );
    },
    [key],
  );
  return [value, setValue];
}
Conclusion
With structural sharing we’ve fixed our performance issue and regained the freedom to update the source from outside our own wrapper. We also get extra benefits: we can now have selectors, and we’re no longer limited to a top-level property.
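For example, a selector-based read hook becomes safe to write (a sketch; useDataSelector is a hypothetical name, reusing the emitter and currentValue from the complete solution):

function useDataSelector(selector) {
  return useSyncExternalStore(
    emitter.subscribe,
    // Structural sharing keeps unchanged parts of `currentValue`
    // referentially stable, so the selected snapshot is stable too and
    // React can bail out of re-rendering
    () => selector(currentValue),
  );
}

// No longer limited to a top-level property
const count = useDataSelector((data) => data.stats.count);

Note that the selector should pick an existing part of the data rather than build a new object, otherwise the snapshot would be fresh on every call and we’d be back to square one.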
I’ve ignored issues related to useSyncExternalStore and its lack of support for concurrent rendering, as that’s not what this post is about.
I hope this shows the power of structural sharing, but it should be noted that there is more that can be done which I didn’t cover in this post. For example, separating the “get” and “set” hooks: if a component only needs to “set”, it shouldn’t subscribe to changes it doesn’t care about (see the sketch below). One of the keys to better React performance is making sure React has less work to do, and not subscribing to unneeded changes, together with structural sharing, is part of how you get there. With these techniques we managed to cut rendering time by up to 50% on some pages.
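As a final illustration, a write-only hook along those lines might look like this (a sketch; useSetData is a hypothetical name, built on the complete solution above):

// No `useSyncExternalStore` here: a component that only writes never
// subscribes, so data changes never re-render it
function useSetData(key) {
  return useCallback(
    (value) => {
      // The source subscription from the complete solution picks this up,
      // runs `replaceEqualDeep`, and notifies the subscribed hooks
      source.setValue(
        JSON.stringify({
          ...currentValue,
          [key]: value,
        }),
      );
    },
    [key],
  );
}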