diff --git a/doc/extractive/training.rst b/doc/extractive/training.rst
index 2446579..adc5b23 100644
--- a/doc/extractive/training.rst
+++ b/doc/extractive/training.rst
@@ -8,6 +8,15 @@ Details
 
 Once the dataset has been converted to the extractive task, it can be used as input to a :class:`data.SentencesProcessor`, which has a :meth:`~data.SentencesProcessor.add_examples()` function to add sets of ``(example, labels)`` and a :meth:`~data.SentencesProcessor.get_features()` function that processes the data and prepares it to be inputted into the model (``input_ids``, ``attention_masks``, ``labels``, ``token_type_ids``, ``sent_rep_token_ids``, ``sent_rep_token_ids_masks``). Feature extraction runs in parallel and tokenizes text using the tokenizer appropriate for the model specified with ``--model_name_or_path``. The tokenizer can be changed to another ``huggingface/transformers`` tokenizer with the ``--tokenizer_name`` option.
 
+.. important:: When loading a pre-trained model you may encounter this common error:
+
+    .. code-block::
+
+        RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
+            Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids".
+
+    To solve this issue, set ``strict=False`` like so: ``model = ExtractiveSummarizer.load_from_checkpoint("distilroberta-base-ext-sum.ckpt", strict=False)``. If you are using the ``main.py`` script, then you can alternatively specify the ``--no_strict`` option.
+
 For the :ref:`CNN/DM dataset <cnn_dm>`, to train a model for 50,000 steps on the data run:
 
 .. code-block:: bash
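
To make the :class:`data.SentencesProcessor` flow described in the paragraph above concrete, here is a minimal sketch of turning one document into model features. It is a sketch under assumptions, not the definitive API: the ``name`` argument, the ``return_type="tensors"`` keyword, and the placeholder documents, labels, and model name are guesses about ``data.py``'s signatures, so check that module for the exact arguments.

.. code-block:: python

    from transformers import AutoTokenizer

    from data import SentencesProcessor

    # Tokenizer matching the model that would be passed via
    # ``--model_name_or_path`` (model name here is a placeholder).
    tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")

    # Each example is a list of sentences; each label list marks which
    # sentences belong in the summary (1) or not (0).
    documents = [
        ["The first sentence.", "The second sentence.", "The third one."]
    ]
    labels = [[1, 0, 0]]

    # ``name`` is illustrative; see data.py for the real constructor.
    processor = SentencesProcessor(name="example_processor")
    processor.add_examples(documents, labels=labels)

    # get_features() tokenizes (in parallel) and builds input_ids,
    # attention_masks, labels, token_type_ids, sent_rep_token_ids, and
    # sent_rep_token_ids_masks; ``return_type`` is assumed here.
    features = processor.get_features(tokenizer, return_type="tensors")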