What is an Elasticsearch document?

In Elasticsearch, a document is the basic unit of data that can be indexed and searched. It is expressed as a JSON object whose key-value pairs are the document’s fields and their values.
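As a rough illustration, a document could look like the following when built in Python and serialized to JSON. The blog-post fields and their values are invented for this sketch and do not come from any real index.

```python
import json

# Hypothetical blog-post document; every field name and value is illustrative.
doc = {
    "title": "Getting started with Elasticsearch",
    "author": "jane.doe",
    "views": 1280,
    "published_at": "2023-05-14T09:30:00Z",
    "tags": ["search", "tutorial"],
}

# Elasticsearch receives a document as a JSON object in the request body.
body = json.dumps(doc)
print(body)
```

Each key is a field name and each value is that field's content. Values can be scalars, arrays, or nested objects.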

Each document in Elasticsearch is stored in an index, which is a collection of documents that have similar characteristics. An index has one or more primary shards, each holding a subset of the data in the index. Each primary shard can have zero or more replica shards, which are copies of the primary that Elasticsearch places on other nodes in the cluster.

Fields in a document can have different data types, such as text, keyword, integer, and date. The index mapping defines how each field is indexed and searched, for example whether a string value is analyzed for full-text search (the text type) or indexed as a single exact value (the keyword type).
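A mapping along the following lines could declare field types for an index. This is a minimal sketch: the field names are hypothetical, and `analyzed_fields` is a helper written for this example, not part of any Elasticsearch client. In current Elasticsearch versions, string data is mapped as `text` (analyzed) or `keyword` (not analyzed).

```python
import json

# Hypothetical mapping body, in the shape used when creating an index.
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},        # analyzed for full-text search
            "author": {"type": "keyword"},    # indexed as one exact value
            "views": {"type": "integer"},
            "published_at": {"type": "date"},
        }
    }
}


def analyzed_fields(mapping):
    """Illustrative helper: return the names of fields mapped as `text`."""
    props = mapping["mappings"]["properties"]
    return sorted(name for name, spec in props.items() if spec["type"] == "text")


print(json.dumps(mapping, indent=2))
print(analyzed_fields(mapping))
```

Mapping `author` as `keyword` suits exact filtering and aggregations, while `title` as `text` is tokenized so that individual words can be matched.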

When a document is indexed, Elasticsearch routes it to a primary shard by hashing a routing value, which defaults to the document ID, and mapping the result onto the number of primary shards. The primary shard then forwards the write to its replica shards, which provides fault tolerance and high availability of data.
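The shard selection can be sketched in simplified form as a hash of the routing value taken modulo the number of primary shards. The code below is an approximation under stated assumptions: Elasticsearch uses a Murmur3 hash internally and its actual formula also involves a routing factor, whereas this sketch substitutes CRC32 from the Python standard library. The `pick_shard` function is hypothetical.

```python
import zlib
from typing import Optional


def pick_shard(doc_id: str, num_primary_shards: int,
               routing: Optional[str] = None) -> int:
    """Simplified routing: hash(routing value) % number of primary shards.

    CRC32 is a stand-in for the hash Elasticsearch uses internally, so the
    shard numbers produced here will not match a real cluster.
    """
    # The routing value defaults to the document ID when none is supplied.
    key = routing if routing is not None else doc_id
    return zlib.crc32(key.encode("utf-8")) % num_primary_shards


for doc_id in ["1", "2", "order-1001"]:
    print(doc_id, "->", pick_shard(doc_id, num_primary_shards=5))
```

Because the result depends on the number of primary shards, the same document ID would map to a different shard if that number changed, which is why the primary shard count is fixed when an index is created.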

Documents in Elasticsearch can be searched using a variety of query types, such as term queries for exact values, match queries for full-text search, and match_phrase queries for phrases. Elasticsearch also provides aggregations, which summarize and analyze large amounts of data.
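Search requests are expressed as JSON bodies in the Elasticsearch Query DSL. The bodies below sketch an exact-value query, a phrase query, a full-text query, and a simple aggregation. The field names and search terms are hypothetical, and the sketch only builds and prints the request bodies without contacting a cluster.

```python
import json

# Exact-value match on a non-analyzed field.
term_query = {"query": {"term": {"author": {"value": "jane.doe"}}}}

# Words must appear together and in this order.
phrase_query = {"query": {"match_phrase": {"title": "getting started"}}}

# Full-text match: the input is analyzed, then matched against the field.
match_query = {"query": {"match": {"title": "elasticsearch tutorial"}}}

# Aggregation: count documents per author and return no individual hits.
agg_request = {
    "size": 0,
    "aggs": {"posts_per_author": {"terms": {"field": "author"}}},
}

for body in (term_query, phrase_query, match_query, agg_request):
    print(json.dumps(body))
```

Each body would typically be sent to an index's `_search` endpoint, either over HTTP or through a client library.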

Overall, documents are the core unit of data in Elasticsearch. Grouping them into indices and distributing them across shards allows for efficient indexing and searching, as well as scalability and fault tolerance.