Sharding is a form of scaling known as horizontal scaling where additional nodes are brought on to share the load. Horizontal scaling allows for near-limitless scalability to handle big data and intense workloads.
The xDB Collection database stores all contact and interaction data, and can experience high write rates from sources such as the Content Delivery role.
If you are using Sitecore xDB Collection SQL Provider since version 9 then you will have following:
- A shard map manager database that stores global information about the shard set
- A shard set of two or more shards
By default, Sitecore is installed with 2 shard databases, shard0 and shard1. This is ok and generally acceptable for smaller sites where you won't see large traffic and interaction numbers however, using only 2 shard databases can become a serious bottleneck in high performing sites.
Unfortunately, almost every Sitecore environment that I have had the opportunity to review is using the default 2 shard databases. This eventually creates performance issues and should be addressed as early as possible.
What are some of the possible issues?
- Each shard database can grow very large and become troublesome to maintain
- SQL Maintenance job on shard DB is unable to complete successfully or in reasonable time
- xConnect becoming unavailable and flooding logs with SQL timeout exceptions
- Overall site performance degradation
So what can you do?
There are couple of options:
- First option is the best one - just make sure you have sufficient number of shards deployed initially. You can have up to 18 shards so if you do currently have large traffic numbers on your site it is almost certain that 2 shards will not be enough. As a general rule I would recommend 2 shards for small, 4-6 for medium and 8 or more shards for large sites.
- Split or merge xDB Collection database shards. This can only be used with SQL Azure and allows you to move specified shard key range from shard A to shard B. I tried this personally and didn't really see any significant improvements. It is better to have more shards.
https://doc.sitecore.com/xp/en/developers/93/platform-administration-and-architecture/split-or-merge-xdb-collection-database-shards.html
One more thing
One of the most common reasons for xConnect and SQL database performance issues are contacts with an excessive number of interactions. This can cause high resource consumption and HTTP server errors because the requests take a long time to execute. You can analyze all contact data in shard databases to find contact with excessive amount of interactions. Typically more than 1,000 interactions per contact is considered large. You can read more about how to analyze the data and how to fix it here https://support.sitecore.com/kb?id=kb_article_view&sysparm_article=KB0417184
Thank you for reading this article :)