Docs Menu
Docs Home
/ / /
PyMongo Driver

Databases and Collections

On this page

  • Overview
  • Access a Database
  • Access a Collection
  • Create a Collection
  • Time Series Collection
  • Capped Collection
  • Get a List of Collections
  • Delete a Collection
  • Type Hints
  • Database
  • Collection
  • Troubleshooting
  • Client Type Annotations
  • Incompatible Type
  • AutoReconnect Error
  • API Documentation

In this guide, you can learn how to use MongoDB databases and collections with PyMongo.

MongoDB organizes data into a hierarchy of the following levels:

  • Databases: The top level of data organization in a MongoDB instance.

  • Collections: MongoDB stores documents in collections. They are analogous to tables in relational databases.

  • Documents: Contain literal data such as string, numbers, dates, and other embedded documents.

For more information about document field types and structure, see the Documents guide in the MongoDB Server manual.

Access a database by using dictionary-style access on your MongoClient instance.

The following example accesses a database named test_database:

database = client["test_database"]

Access a collection by using dictionary-style access on an instance of your database.

The following example accesses a collection named test_collection:

database = client["test_database"]
collection = database["test_collection"]

Tip

If the provided collection name does not already exist in the database, MongoDB implicitly creates the collection when you first insert data into it.

Use the create_collection() method to explicitly create a collection in a MongoDB database.

The following example creates a collection called example_collection:

database = client["test_database"]
database.create_collection("example_collection")

You can specify collection options, such as maximum size and document validation rules, by passing them in as keyword arguments. For a full list of optional parameters, see the create_collection() API documentation.

Time series collections efficiently store sequences of measurements over a period of time. The following example creates a time series collection called example_ts_collection in which the documents' time field is called timestamp:

database = client["test_database"]
database.create_collection("example_ts_collection", timeseries={"timeField": "timestamp"})

For more information about using time series data with PyMongo, see the Time Series Data guide.

You can create a capped collection that cannot grow beyond a specified memory size or document count. The following example creates a capped collection called example_capped_collection that has a maximum size of 1000 bytes:

database = client["test_database"]
database.create_collection("example_capped_collection", capped=True, size=1000)

To learn more about capped collections, see Capped Collections in the MongoDB Server manual.

You can query for a list of collections in a database by calling the list_collections() method. The method returns a cursor containing all collections in the database and their associated metadata.

The following example calls the list_collections() method and iterates over the cursor to print the results:

collection_list = database.list_collections()
for c in collection_list:
print(c)

To query for only the names of the collections in the database, call the list_collection_name() method as follows:

collection_list = database.list_collection_names()
for c in collection_list:
print(c)

For more information about iterating over a cursor, see Access Data From a Cursor.

You can delete a collection from the database by using the drop_collection() method.

The following example deletes the test_collection collection:

collection = database["test_collection"];
collection.drop();

Warning

Dropping a Collection Deletes All Data in the Collection

Dropping a collection from your database permanently deletes all documents and all indexes within that collection.

Drop a collection only if the data in it is no longer needed.

If your application uses Python 3.5 or later, you can add type hints, as described in PEP 484, to your code. Type hints denote the data types of variables, parameters, and function return values, and the structure of documents. Some IDEs can use type hints to check your code for type errors and suggest appropriate options for code completion.

Note

TypedDict in Python 3.7 and Earlier

The TypedDict class is in the typing module, which is available only in Python 3.8 and later. To use the TypedDict class in earlier versions of Python, install the typing_extensions package.

If all documents in a database match a well-defined schema, you can specify a type hint that uses a Python class to represent the documents' structure. By including this class in the type hint for your Database object, you can ensure that all documents you store or retrieve have the required structure. This provides more accurate type checking and code completion than the default Dict[str, Any] type.

First, define a class to represent a document from the database. The class must inherit from the TypedDict class and must contain the same fields as the documents in the database. After you define your class, include its name as the generic type for the Database type hint.

The following example defines a Movie class and uses it as the generic type for a Database type hint:

from typing import TypedDict
from pymongo import MongoClient
from pymongo.database import Database
class Movie(TypedDict):
name: str
year: int
client: MongoClient = MongoClient()
database: Database[Movie] = client["test_database"]

Adding a generic type to a Collection type hint is similar to adding a generic type to a Database type hint. First, define a class that inherits from the TypedDict class and represents the structure of the documents in the collection. Then, include the class name as the generic type for the Collection type hint, as shown in the following example:

from typing import TypedDict
from pymongo import MongoClient
from pymongo.collection import Collection
class Movie(TypedDict):
name: str
year: int
client: MongoClient = MongoClient()
database = client["test_database"]
collection: Collection[Movie] = database["test_collection"]

If you don't add a type annotation for your MongoClient object, your type checker might show an error similar to the following:

from pymongo import MongoClient
client = MongoClient() # error: Need type annotation for "client"

The solution is to annotate the MongoClient object as client: MongoClient or client: MongoClient[Dict[str, Any]].

If you specify MongoClient as a type hint but don't include data types for the document, keys, and values, your type checker might show an error similar to the following:

error: Dict entry 0 has incompatible type "str": "int";
expected "Mapping[str, Any]": "int"

The solution is to add the following type hint to your MongoClient object:

``client: MongoClient[Dict[str, Any]]``

You receive this error if you specify tag-sets in your read preference and MongoDB is unable to find replica set members with the specified tags. To avoid this error, include an empty dictionary ({}) at the end of the tag-set list. This instructs PyMongo to read from any member that matches the read-reference mode when it can't find matching tags.

To learn more about any of the methods or types discussed in this guide, see the following API documentation:

Back

Connection Pools