sh.shardAndDistributeCollection()

Definition

sh.shardAndDistributeCollection(namespace, key, unique, options)

Shards a collection and immediately redistributes the data using the provided shard key. The immediate redistribution of data allows for faster data movement and reduced impact to workloads.

Important

mongosh Method

This page documents a mongosh method. This is not the documentation for a language-specific driver, such as Node.js.

For MongoDB API drivers, refer to the language-specific MongoDB driver documentation.

Running sh.shardAndDistributeCollection() in mongosh has the same result as consecutively running the shardCollection and reshardCollection commands.
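
The equivalence described above can be sketched as two administrative commands. This is an illustration, not the method's actual implementation. The `records.people` namespace and `zipcode` key are borrowed from the examples on this page, and the `forceRedistribution` field is an assumption about how resharding to an unchanged key is requested, so check the `reshardCollection` reference for your server version. The guard lets the snippet load outside mongosh without running anything.

```javascript
// Hypothetical namespace and key, borrowed from the examples on this page.
const namespace = "records.people";
const key = { zipcode: 1 };

// Step 1: shard the collection on the chosen key.
const shardCmd = { shardCollection: namespace, key: key };

// Step 2: reshard on the same key so the data is redistributed.
// Assumption: forceRedistribution permits resharding to an unchanged key.
const reshardCmd = {
  reshardCollection: namespace,
  key: key,
  forceRedistribution: true
};

// Only run the commands when a mongosh `db` object is available.
if (typeof db !== "undefined") {
  printjson(db.adminCommand(shardCmd));
  printjson(db.adminCommand(reshardCmd));
}
```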

Parameters

sh.shardAndDistributeCollection() takes the following parameters:

Parameter	Type	Necessity	Description
`namespace`	String	Required	The namespace of the collection to shard in the format `"<database>.<collection>"`.
`key`	Document	Required	The document that specifies the field or fields to use as the shard key, in the form `{ <field1>: <1|"hashed">, ... }`. Set each field value to either `1` for range-based sharding or `"hashed"` for a hashed shard key. Shard keys must be supported by an index, and the index must exist before you run the `shardAndDistributeCollection()` method. See also: Shard Key Indexes.
`unique`	Boolean	Optional	Specify `true` to ensure that the underlying index enforces a unique constraint. Defaults to `false`. When using hashed shard keys, you can't specify `true`.
`options`	Document	Optional	A document containing optional fields, including `numInitialChunks` and `collation`.
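
Because the supporting index must exist before the method runs, a typical call sequence creates the index first. The following is a minimal sketch, assuming a hypothetical `records.people` collection and `zipcode` field; the guard only skips the calls when the snippet is loaded outside mongosh.

```javascript
// Hypothetical database, collection, and shard key.
const dbName = "records";
const collName = "people";
const shardKey = { zipcode: 1 };
const namespace = `${dbName}.${collName}`;

if (typeof db !== "undefined" && typeof sh !== "undefined") {
  // Create the index that supports the shard key before sharding.
  db.getSiblingDB(dbName).getCollection(collName).createIndex(shardKey);

  // Shard and redistribute using the same key document.
  sh.shardAndDistributeCollection(namespace, shardKey);
}
```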

The options argument supports the following options:

Parameter	Type	Description
`numInitialChunks`	Integer	Specifies the initial number of chunks to create across all shards in the cluster when sharding and resharding a collection. MongoDB then creates and balances chunks across the cluster. The `numInitialChunks` value must result in fewer than `8192` chunks per shard. Defaults to `1000` chunks per shard.
`collation`	Document	If the collection specified to `shardAndDistributeCollection()` has a default collation, you must include a collation document with `{ locale : "simple" }`, or the `shardAndDistributeCollection()` method fails.
`presplitHashedZones`	Boolean	Specify `true` to perform initial chunk creation and distribution for an empty or non-existing collection based on the defined zones and zone ranges for the collection. For hashed sharding only. `shardAndDistributeCollection()` with `presplitHashedZones: true` returns an error if any of the following are true: the shard key does not contain a hashed field (that is, it is not a single-field hashed index or compound hashed index); the collection has no defined zones or zone ranges; or the defined zone ranges do not meet the requirements.
`timeseries`	Document	Specify this option to create a new sharded time series collection. To shard an existing time series collection, omit this parameter. When the collection specified to `shardAndDistributeCollection` is a time series collection and the `timeseries` option is not specified, MongoDB uses the values that define the existing time series collection to populate the `timeseries` field. For detailed syntax, see Time Series Options.
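
As an illustration of the `timeseries` option, the sketch below passes a time series definition in the `options` document. The `test.weather` namespace and the `timestamp`, `metadata`, and `sensorId` field names are hypothetical, and the guard only skips the call when the snippet is loaded outside mongosh.

```javascript
// Hypothetical time series definition for a new sharded collection.
const timeseriesOptions = {
  timeseries: {
    timeField: "timestamp",
    metaField: "metadata",
    granularity: "hours"
  }
};

// Shard key: a sub-field of the metaField, then the timeField (ranged, last).
const tsKey = { "metadata.sensorId": 1, timestamp: 1 };

if (typeof sh !== "undefined") {
  // Arguments follow the documented order: namespace, key, unique, options.
  sh.shardAndDistributeCollection("test.weather", tsKey, false, timeseriesOptions);
}
```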

Considerations

The following factors can impact performance or the distribution of your data.

Shard Keys

Although you can change your shard key later, carefully consider your shard key choice to optimize scalability and performance.

Shard Keys on Time Series Collections

When sharding time series collections, you can only specify the following fields in the shard key:

The metaField
Sub-fields of metaField
The timeField

You may specify combinations of these fields in the shard key. No other fields, including _id, are allowed in the shard key pattern.

When you specify the shard key:

metaField can be either a:
- Hashed shard key
- Ranged shard key
timeField must be:
- A ranged shard key
- At the end of the shard key pattern

Tip

Avoid specifying only the timeField as the shard key. Since the timeField increases monotonically, it may result in all writes appearing on a single chunk within the cluster. Ideally, data is evenly distributed across chunks.
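
The rules above can be expressed as a small client-side check. The helper below is a hypothetical sketch, not part of mongosh or the server; it only mirrors the restrictions listed in this section so that a key pattern can be sanity-checked before it is passed to `sh.shardAndDistributeCollection()`.

```javascript
// Hypothetical helper: returns a list of problems with a time series shard key.
// An empty list means the key follows the rules described in this section.
function checkTimeSeriesShardKey(key, timeField, metaField) {
  const problems = [];
  const fields = Object.keys(key);

  fields.forEach((field, i) => {
    const isMeta = metaField !== undefined &&
      (field === metaField || field.startsWith(metaField + "."));
    const isTime = field === timeField;

    if (!isMeta && !isTime) {
      problems.push(`${field}: only the metaField, its sub-fields, or the timeField are allowed`);
    }
    if (isTime && key[field] !== 1) {
      problems.push(`${field}: the timeField must be a ranged shard key`);
    }
    if (isTime && i !== fields.length - 1) {
      problems.push(`${field}: the timeField must be at the end of the shard key pattern`);
    }
  });

  // Advisory only: a key made of the timeField alone can concentrate writes.
  if (fields.length === 1 && fields[0] === timeField) {
    problems.push("warning: a timeField-only key may send all writes to a single chunk");
  }
  return problems;
}

// Example: hashed metaField sub-field followed by a ranged timeField.
const result = checkTimeSeriesShardKey(
  { "metadata.sensorId": "hashed", timestamp: 1 },
  "timestamp",
  "metadata"
);
console.log(result.length === 0 ? "key looks valid" : result);
```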

Hashed Shard Keys

Hashed shard keys use a hashed index or a compound hashed index as the shard key.

To specify a hashed shard key field, use field: "hashed".

Note

If chunk migrations are in progress while creating a hashed shard key collection, the initial chunk distribution may be uneven until the balancer automatically balances the collection.

Zone Sharding and Initial Chunk Distribution

The shard collection operation (that is, the shardCollection command and the sh.shardCollection() helper) can perform initial chunk creation and distribution for an empty or non-existing collection if zones and zone ranges have been defined for the collection. Initial chunk distribution allows for a faster setup of zoned sharding. After the initial distribution, the balancer manages chunk distribution as usual.

For an example, see Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection. If sharding a collection using a ranged or single-field hashed shard key, the numInitialChunks option has no effect if zones and zone ranges have been defined for the empty collection.

To shard a collection using a compound hashed index, see Initial Chunk Distribution with Compound Hashed Indexes.

Initial Chunk Distribution with Compound Hashed Indexes

MongoDB supports sharding collections on compound hashed indexes. When sharding an empty or non-existing collection using a compound hashed shard key, additional requirements apply in order for MongoDB to perform initial chunk creation and distribution.

The numInitialChunks option has no effect if zones and zone ranges have been defined for the empty collection and presplitHashedZones is false.

For an example, see Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection.

Uniqueness

If you specify unique: true, you must create the index before using sh.shardAndDistributeCollection().

Although you can have a unique compound index where the shard key is a prefix, if you use the unique parameter, the collection must have a unique index that is on the shard key.
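
As a sketch of the ordering described above, the following creates a unique index on the shard key and then passes `true` for the `unique` parameter. The `records.people` namespace and the `ssn` field are hypothetical, and the guard only skips the calls when the snippet is loaded outside mongosh.

```javascript
// Hypothetical namespace and shard key. The key must be ranged,
// because unique: true cannot be combined with a hashed shard key.
const uniqueNamespace = "records.people";
const uniqueKey = { ssn: 1 };

if (typeof db !== "undefined" && typeof sh !== "undefined") {
  // Create the unique index on the shard key first.
  db.getSiblingDB("records").getCollection("people")
    .createIndex(uniqueKey, { unique: true });

  // Third argument is `unique`.
  sh.shardAndDistributeCollection(uniqueNamespace, uniqueKey, true);
}
```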

Write Concern

mongos uses "majority" for the write concern of the shardCollection command, its helper sh.shardCollection(), and the sh.shardAndDistributeCollection() method.

Examples

The following examples show how you can use the sh.shardAndDistributeCollection() method with or without optional parameters.

Simple Usage

A database named records contains a collection named people. The following command shards the collection by the zipcode field and immediately redistributes the data in the records.people collection:

sh.shardAndDistributeCollection("records.people", { zipcode: 1 } )

Usage with Options

The phonebook database has a contacts collection with no default collation. The following example uses sh.shardAndDistributeCollection() to shard and redistribute the phonebook.contacts collection with:

A hashed shard key on the last_name field.
5 initial chunks.
A simple collation.

sh.shardAndDistributeCollection(
  "phonebook.contacts",
  { last_name: "hashed" },
  false,
  {
    numInitialChunks: 5,
    collation: { locale: "simple" }
  }
)
