diff --git a/source/includes/data-source.rst b/source/includes/data-source.rst
deleted file mode 100644
index 2f18028..0000000
--- a/source/includes/data-source.rst
+++ /dev/null
@@ -1,5 +0,0 @@
-.. note::
-
-   The empty argument ("") refers to a file to use as a data source.
-   In this case our data source is a MongoDB collection, so the data
-   source argument is empty.
\ No newline at end of file
diff --git a/source/includes/scala-java-explicit-schema.rst b/source/includes/scala-java-explicit-schema.rst
deleted file mode 100644
index 3b682cb..0000000
--- a/source/includes/scala-java-explicit-schema.rst
+++ /dev/null
@@ -1,13 +0,0 @@
-By default, reading from MongoDB in a ``SparkSession`` infers the
-schema by sampling documents from the collection. You can also use a
-|class| to define the schema explicitly, thus removing the extra
-queries needed for sampling.
-
-.. note::
-
-   If you provide a case class for the schema, MongoDB returns **only
-   the declared fields**. This helps minimize the data sent across the
-   wire.
-
-The following statement creates a ``Character`` |class| and then
-uses it to define the schema for the DataFrame: