
Can't use "local" computing in BERT_Eval_SQUAD.ipynb #42

Open
jsrimr opened this issue Nov 3, 2019 · 0 comments

jsrimr commented Nov 3, 2019

Hello,
since I am on the free tier, I cannot use a GPU cluster in Azure.
So I modified the AzureML-BERT/finetune/PyTorch/notebooks/BERT_Eval_SQUAD.ipynb script a little, like below:

from azureml.train.dnn import PyTorch
from azureml.core.runconfig import RunConfiguration
from azureml.core.container_registry import ContainerRegistry

run_user_managed = RunConfiguration()
run_user_managed.environment.python.user_managed_dependencies = True

# Define custom Docker image info
image_name = 'mcr.microsoft.com/azureml/bert:pretrain-openmpi3.1.2-cuda10.0-cudnn7-ubuntu16.04'

estimator = PyTorch(source_directory='../../../',
                    compute_target="local",
                    # Docker image
                    use_docker=True,
                    custom_docker_image=image_name,
                    user_managed=True,
                    script_params={
                          '--bert_model':'bert-large-uncased',
                          "--model_file_location": checkpoint_path,
                          '--model_file': 'bert_encoder_epoch_245.pt',
                          '--do_train' : '',
                          '--do_predict': '',
                          '--train_file': train_path,
                          '--predict_file': dev_path,
                          '--max_seq_length': 512,
                          '--train_batch_size': 8,
                          '--learning_rate': 3e-5,
                          '--num_train_epochs': 2.0,
                          '--doc_stride': 128,
                          '--seed': 32,
                          '--gradient_accumulation_steps':4,
                          '--warmup_proportion':0.25,
                          '--output_dir': './outputs',
                          '--fp16':'',
                          #'--loss_scale':128,
                    },
                    entry_script='./finetune/run_squad_azureml.py',
                    node_count=1,
                    process_count_per_node=4,
                    distributed_backend='mpi',
                    use_gpu=True)

# path to the Python environment in the custom Docker image
estimator._estimator_config.environment.python.interpreter_path = '/opt/miniconda/envs/amlbert/bin/python'
run = experiment.submit(estimator)
from azureml.widgets import RunDetails
RunDetails(run).show()

However, when I run this script, I get this error:

"error": {
    "message": {
        "error_details": {
            "correlation": {
                "operation": "0b58ac6218ccb845aeae7d20056dfba1",
                "request": "U4WAXOUHsWo="
            },
            "environment": "koreacentral",
            "error": {
                "code": "UserError",
                "message": "Communicators are not supported for local runs."
            },
            "location": "koreacentral",
            "time": "2019-11-03T03:11:31.494157+00:00"
        },
        "status_code": 400,
        "url": "https://koreacentral.experiments.azureml.net/execution/v1.0/subscriptions/8170d900-06ad-4a1d-babd-1a30120ea257/resourceGroups/Bertpipeline/providers/Microsoft.MachineLearningServices/workspaces/BertSquad/experiments/BERT-SQuAD/localrun?runId=BERT-SQuAD_1572750686_983087f7"
    }
}
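For reference, the error points at the `distributed_backend='mpi'` and `process_count_per_node=4` arguments: a "local" compute target cannot start an MPI communicator. A minimal sketch of how one might strip the distributed-run options before constructing the estimator (the kwargs below are copied from the snippet above; `make_local_safe` is a hypothetical helper, not part of the AzureML SDK, and this is not a verified fix):

```python
# Sketch: "Communicators are not supported for local runs", so drop the
# distributed settings when targeting compute_target="local".
estimator_kwargs = {
    "compute_target": "local",
    "use_docker": True,
    "node_count": 1,
    "process_count_per_node": 4,   # multiple processes require a communicator
    "distributed_backend": "mpi",  # rejected for local runs
    "use_gpu": True,
}

def make_local_safe(kwargs):
    """Return a copy of the estimator kwargs with distributed options removed."""
    cleaned = dict(kwargs)
    cleaned.pop("distributed_backend", None)  # no MPI on local compute
    cleaned["process_count_per_node"] = 1     # single local process
    return cleaned

local_kwargs = make_local_safe(estimator_kwargs)
```

The cleaned dict could then be splatted into the `PyTorch(...)` constructor (`PyTorch(source_directory='../../../', **local_kwargs, ...)`), though the training script itself may still assume a distributed launch.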

Is there any way to handle this issue?

Thanks for providing such an amazing repo!
