Problem with training yolov8 on TPU #1809

Open
martinsaieh96 opened this issue Mar 20, 2024 · 0 comments

martinsaieh96 commented Mar 20, 2024

I'm trying to adapt the "Object Detection with KerasCV" tutorial to train YOLOv8 on a TPU in Google Colab, but I get an error when I call model.fit.

Code

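import tensorflow as tf
import keras_cv

# TF_MASTER is assumed to hold the TPU address and to be defined earlier in the notebook.
tpu_address = TF_MASTER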
try:
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(TF_MASTER)
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)
except ValueError:
    print("TPU not found. Using CPU or GPU.")
    strategy = tf.distribute.get_strategy()

# Reuse `strategy` so the CPU/GPU fallback also works (`resolver` may be undefined if the TPU lookup fails).
BATCH_SIZE = 16 * strategy.num_replicas_in_sync

with strategy.scope():
    model = keras_cv.models.YOLOV8Detector.from_preset(
        "resnet50_imagenet",
        bounding_box_format="xywh",
        num_classes=20,
    )
    model.compile(
        classification_loss="binary_crossentropy",
        box_loss="ciou",
        optimizer=optimizer,
    )
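
For reference, the global batch size in the snippet above is the per-replica batch size multiplied by the number of replicas in sync. A minimal sketch of that arithmetic follows; the helper name `global_batch_size` is hypothetical, and the replica count of 8 is an assumption (a typical Colab TPU), not a value taken from this report.

```python
def global_batch_size(per_replica_batch: int, num_replicas: int) -> int:
    """Return the batch size consumed by one distributed training step."""
    if per_replica_batch <= 0 or num_replicas <= 0:
        raise ValueError("both arguments must be positive")
    return per_replica_batch * num_replicas


# 16 per replica, as in the snippet; 8 replicas is an assumed value.
print(global_batch_size(16, 8))  # 128
```

On the CPU/GPU fallback path the default strategy reports a single replica, so the same formula gives a batch size of 16.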

Error output

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-19-3ac54344eb4a> in <cell line: 1>()
----> 1 model.fit(
      2     train_ds.take(20),
      3     # Run for 10-35~ epochs to achieve good scores.
      4     epochs=1,
      5     callbacks=[coco_metrics_callback],

1 frames
/usr/local/lib/python3.10/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
   5981 def raise_from_not_ok_status(e, name) -> NoReturn:
   5982   e.message += (" name: " + str(name if name is not None else ""))
-> 5983   raise core._status_to_exception(e) from None  # pylint: disable=protected-access
   5984 
   5985 

InvalidArgumentError: Unable to parse tensor proto [Op:DatasetCardinality] name:
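
The failing op in the traceback is DatasetCardinality, so one way to narrow this down may be to query the dataset's cardinality directly, outside model.fit. The sketch below is only a diagnostic idea and has not been verified on a TPU. The helper names `describe_cardinality` and `check_cardinality` are hypothetical, and `train_ds` is assumed to be the tf.data.Dataset built in the tutorial.

```python
# Sentinel values that tf.data uses for dataset cardinality.
INFINITE_CARDINALITY = -1  # same value as tf.data.INFINITE_CARDINALITY
UNKNOWN_CARDINALITY = -2   # same value as tf.data.UNKNOWN_CARDINALITY


def describe_cardinality(n: int) -> str:
    """Turn a raw cardinality value into a readable description."""
    if n == INFINITE_CARDINALITY:
        return "infinite"
    if n == UNKNOWN_CARDINALITY:
        return "unknown"
    return f"{n} elements"


def check_cardinality(dataset) -> str:
    """Query the cardinality eagerly; `dataset` is expected to be a tf.data.Dataset."""
    return describe_cardinality(int(dataset.cardinality()))


# Hypothetical usage in the notebook (train_ds is not defined in this sketch):
# print(check_cardinality(train_ds.take(20)))
```

If this call raises the same InvalidArgumentError, the problem likely lies in how the dataset is built or where it is placed rather than in the YOLOv8 model itself.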