sh.shardAndDistributeCollection()
Definition
sh.shardAndDistributeCollection(namespace, key, unique, options)
Shards a collection and immediately redistributes the data using the provided shard key. The immediate redistribution of data allows for faster data movement and reduced impact to workloads.
Important
mongosh Method
This page documents a
mongosh
method. This is not the documentation for a language-specific driver, such as Node.js.For MongoDB API drivers, refer to the language-specific MongoDB driver documentation.
Running
sh.shardAndDistributeCollection()
inmongosh
has the same result as consecutively running theshardCollection
andreshardCollection
commands.
Parameters
sh.shardAndDistributeCollection()
takes the following parameters:
Parameter | Type | Necessity | Description |
---|---|---|---|
namespace | String | Required | The namespace of the collection to shard in the format
"<database>.<collection>" . |
key | Document | Required | The document that specifies the field or fields to use as the shard key.
Set the field value to either:
|
unique | Boolean | Optional | Specify When using hashed shard keys, you can't
specify |
options | Document | Optional | A document containing optional fields, including
numInitialChunks and collation . |
The options
argument supports the following options:
Parameter | Type | Description |
---|---|---|
numInitialChunks | Integer | Specifies the initial number of chunks to create across all shards in
the cluster when sharding and resharding a collection. MongoDB then
creates and balances chunks across the cluster. The
numInitialChunks parameter must result in less than 8192 per
shard. Defaults to 1000 chunks per shard. |
collation | Document | If the collection specified to shardAndDistributeCollection()
has a default collation, you must include a
collation document with { locale : "simple" } , or the
shardAndDistributeCollection() method fails. |
presplitHashedZones | Boolean | Specify
|
timeseries | Document | Specify this option to create a new sharded time series collection. To shard an existing time series collection, omit this parameter. When the collection specified to For detailed syntax, see Time Series Options. |
Considerations
The following factors can impact performance or the distribution of your data.
Shard Keys
Although you can change your shard key later, carefully consider your shard key choice to optimize scalability and perfomance.
Shard Keys on Time Series Collections
When sharding time series collections, you can only specify the following fields in the shard key:
The
metaField
Sub-fields of
metaField
The
timeField
You may specify combinations of these fields in the shard key. No other
fields, including _id
, are allowed in the shard key pattern.
When you specify the shard key:
metaField
can be either a:timeField
must be:At the end of the shard key pattern
Tip
Avoid specifying only the timeField
as the shard key. Since
the timeField
increases monotonically, it may result in all writes appearing on a
single chunk within the cluster. Ideally, data is evenly distributed
across chunks.
To learn how to best choose a shard key, see:
Hashed Shard Keys
Hashed shard keys use a hashed index or a compound hashed index as the shard key.
To specify a hashed shard key field, use field: "hashed"
.
Note
If chunk migrations are in progress while creating a hashed shard key collection, the initial chunk distribution may be uneven until the balancer automatically balances the collection.
Zone Sharding and Initial Chunk Distribution
The shard collection operation (i.e. shardCollection
command and the sh.shardCollection()
helper) can perform
initial chunk creation and distribution for an empty or a
non-existing collection if zones and zone ranges have been defined for the collection. Initial
chunk distribution allows for a faster setup of zoned sharding.
After the initial distribution, the balancer manages the chunk
distribution going forward per usual.
For an example, see Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection. If sharding a
collection using a ranged or single-field hashed shard key, the
numInitialChunks
option has no effect if zones and zone ranges have
been defined for the empty collection.
To shard a collection using a compound hashed index, see Initial Chunk Distribution with Compound Hashed Indexes.
Initial Chunk Distribution with Compound Hashed Indexes
MongoDB supports sharding collections on compound hashed indexes. When sharding an empty or non-existing collection using a compound hashed shard key, additional requirements apply in order for MongoDB to perform initial chunk creation and distribution.
The numInitialChunks
option has no effect if zones and zone ranges
have been defined for the empty collection and
presplitHashedZones
is false
.
For an example, see Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection.
Uniqueness
If you specify unique: true
, you must create the index
before using sh.shardAndDistributeCollection()
.
Although you can have a unique compound index where the shard
key is a prefix, if you use the unique
parameter, the collection must have a unique index that is on the shard
key.
Collation
If the collection has a default collation,
the sh.shardAndDistributeCollection
command must include a collation
parameter with the
value { locale: "simple" }
. For non-empty collections with a
default collation, you must have at least one index with the simple
collation whose fields support the shard key pattern.
You do not need to specify the collation
option for collections
without a collation. If you do specify the collation option for
a collection with no collation, it will have no effect.
Write Concern
mongos
uses "majority"
for the
write concern of the
shardCollection
command, its helper
sh.shardCollection()
, and the
sh.shardAndDistributeCollection()
method.
Examples
The following examples show how you can use the
sh.shardAndDistributeCollection()
method with or without optional
parameters.
Simple Usage
A database named records
contains a collection named people
. The
following command shards the collection by the zipcode
field and
immediately redistributes the data in the records.people
collection:
sh.shardAndDistributeCollection("records.people", { zipcode: 1 } )
Usage with Options
The phonebook
database has a contacts
collection with no
default collation. The
following example uses sh.shardAndDistributeCollection()
to shard and
redistribute the phonebook.contacts
collection with:
A Hashed shard key on the
last_name
field.5
initial chunks.simple
collation.
sh.shardAndDistributeCollection( "phonebook.contacts", { last_name: "hashed" }, false, { numInitialChunks: 5, collation: { locale: "simple" } } )