Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValueError: tensorflow op ExtractImagePatches is not supported #436

Open
ayermolo opened this issue Apr 2, 2019 · 30 comments · May be fixed by #2188
Open

ValueError: tensorflow op ExtractImagePatches is not supported #436

ayermolo opened this issue Apr 2, 2019 · 30 comments · May be fixed by #2188
Labels
contribution welcome Community contribution is welcomed unsupported ops Issues related to unsupported operators

Comments

@ayermolo
Copy link

ayermolo commented Apr 2, 2019

I am trying to convert yolov2-voc generated by DarkFlow in to ONNX, and running in to this error
"ValueError: tensorflow op ExtractImagePatches is not supported"

I am using tensorflow=1.9.0, onnx=1.4.1, opset=7, tfonnx=1.5.0/9fcb06.

Is this tensorflow issue, or the converter? How would I got about fixing this?

Thank You

@nbcsm
Copy link
Collaborator

nbcsm commented Apr 2, 2019

currently, for TF ExtractImagePatches op, there is no equivalent op in onnx.
would need to propose and add such op into future opset.

@ayermolo
Copy link
Author

ayermolo commented Apr 2, 2019

@nbcsm Thank you for reply. What would be the process for that?

@nbcsm
Copy link
Collaborator

nbcsm commented Apr 2, 2019

go to https://github.com/onnx/onnx, and open an issue for ExtractImagePatches

@ayermolo
Copy link
Author

ayermolo commented Apr 2, 2019

Great, thanks

@prasanthpul
Copy link
Member

@nbcsm can it be composed in the converter?

@ayermolo
Copy link
Author

ayermolo commented Apr 3, 2019

Along the same lines. Translated in to something else in special cases:
I think ExtractImagePatch with ksizes = [1, blk, blk, 1], stride = [1, blk, blk, 1], rates = [1, 1, 1, 1] is equivalent to SpaceToDepth with blocksize = blk, blk.

@nbcsm
Copy link
Collaborator

nbcsm commented Apr 3, 2019

for that case (ExtractImagePatch with ksizes = [1, blk, blk, 1], stride = [1, blk, blk, 1], rates = [1, 1, 1, 1]), yes, SpaceToDepth seems to work.
but it may not work for other ExtractImagePatch cases.

it feels like we may use MaxPool + SpaceToDepth to convert ExtractImagePatch, but need to double check.

@nbcsm nbcsm added need investigation Need investigation contribution welcome Community contribution is welcomed and removed needs onnx change need investigation Need investigation labels Apr 11, 2019
@xtremertx
Copy link

xtremertx commented Aug 5, 2019

Hi, I'm having the same issue here when trying to convert YOLOv2 full network (416x416) into ONNX and I really need ExtractImagePatch op, I was trying to use tensorflow model instead but it doesnt worked, as ML.NET failed to load it. Please add support for this or at least some workaround code?

How do I define own special op for that?

@xtremertx
Copy link

xtremertx commented Aug 5, 2019

@nbcsm Hi can someone just paste here the ExtractImagePatch handler that is using SpaceToDepth ? I'm new in this area and writing own operations seems quite hard to me, I need some assistence, thanks.

@ayermolo
Copy link
Author

ayermolo commented Aug 5, 2019

Are you using DarkFlow?
It's pretty straightforward: ayermolo/darkflow@c62d503

@xtremertx
Copy link

xtremertx commented Aug 5, 2019

Yes, I have used DarkFlow and converted darknet (.weights) model into tensorflow (.pb), because only tensorflow and ONNX are supported formats in ML.NET which I'm using. However tensorflow model throwed error while loading it using ML.NET (I have tested that model using darkflow and detections were working there, will report that issue to MS). So logical step was to try to convert my model into ONNX that is also supported.

Gonna to try your commit, thanks a lot.

@jhagege
Copy link

jhagege commented Oct 26, 2020

Hi, any update on this ?
I am trying to convert a model using this operator in Tensor flow without success.
Thanks for any assistance !

@TomWildenhain-Microsoft
Copy link
Contributor

#1543 features another model using this op. ksizes = [1, 3, 3, 1], rates = [1, 1, 1, 1], strides = [1, 1, 1, 1], padding = 'SAME'/'VALID'

@spillerrec
Copy link

I have tried to make a numpy implementation of ExtractImagePatches:

def numpy_eip(arr, ksizes, strides, rates, padding):
	sizes = [1, ksizes[1]*rates[1] - (rates[1]-1), ksizes[2]*rates[2] - (rates[2]-1), 1]
	
	if padding == 'SAME':
		extra_i = max(0, (arr.shape[1]-1) // strides[1] * strides[1] + sizes[1] - arr.shape[1])
		extra_j = max(0, (arr.shape[2]-1) // strides[2] * strides[2] + sizes[2] - arr.shape[2])
		arr = np.pad(arr, [(0,0), (extra_i//2, extra_i//2 + extra_i%2), (extra_j//2, extra_j//2 + extra_j%2), (0,0)])
	elif padding != 'VALID':
		raise Exception('Padding type "%s" is not supported' % padding)
	
	def make_range(in_size, k_size, rate, stride):
		return range(0, in_size - (k_size*rate - rate), stride)
	indexes_i = make_range(arr.shape[1], ksizes[1], rates[1], strides[1])
	indexes_j = make_range(arr.shape[2], ksizes[2], rates[2], strides[2])

	batch_size = arr.shape[0]
	channel_size = ksizes[1]*ksizes[2]*arr.shape[3]

	return np.concatenate([np.concatenate([
		arr[:, i : sizes[1]+i : rates[1], j : sizes[2]+j : rates[2], :].reshape([batch_size, 1, 1, channel_size])
			for j in indexes_j], axis=2)
				for i in indexes_i], axis=1)

Note that the TF documentation says that ksizes, strides, and rates must be on the form [1, a, b, 1]. ExtractImagePatches can probably be seen as a generalized version of SpaceToDepth where the blocks/tiles may overlap or have gaps between them.
If the dimensions of the input array is known, I believe it should be possible to map this to a series of slices, reshape/flatten, and concatenations. (The last 4 lines.) In my case we are talking about 32x32/30x30 of those which is a bit excessive. Any ideas on how to do this differently? I do not understand what the MaxPool idea is about.

@TomWildenhain-Microsoft
Copy link
Contributor

If we want to do an onnx op, could we do a conv with an identity kernel?

@spillerrec
Copy link

I have done a quick test with my ksizes = [1, 3, 3, 1], rates = [1, 1, 1, 1], strides = [1, 1, 1, 1], padding = 'VALID' case and it appears to work:
eip_valid.zip
(I have yet to test it in the full model, I will take a few days off this so I will check that later this week.)
The main issue is that this is 20MB uncompressed which is a bit of a shame. But other than that this could be a quite good mapping if the rest of the stride/rate fits the Conv behavior AND you have a low ksize and channel dimension. This gets more and more bloated as ksize increases.

@TomWildenhain-Microsoft
Copy link
Contributor

I think you should be able to move the M and C values of 2304 and 256 to the batch dimension, do the conv with a simple [1, 1, 3, 3] kernel and put the batch dims back. That should make the kernel much smaller.

@spillerrec
Copy link

Minimal effort test without batch handling, seems to work, and down to 1 KiB with a [9, 1, 3, 3] kernel: eip_valid.zip
This is using a reshape however which I think would be nice to avoid, as it requires you to specify a size. I like the concept of dynamic axes at least.

@TomWildenhain-Microsoft
Copy link
Contributor

Perfect. Actually if reshape sees a -1 in the shape it will dump all the remaining dimensions into there, and a 0 indicates to keep that dim the same. Since the input is [9, 256, 30, 30], try [1, -1, 0, 0] instead of [1, 2304, 30, 30]. It is also common to use Shape, Slice, Concat, and Reshape together to make the shapes dynamic. Should be cheap to execute since all those ops are small.

@spillerrec
Copy link

That is really useful, I could fix the rest of the Reshapes in my model using that as well.
I have not yet looked at how this Conv approach works for other cases, but I have converted my model and I'm quite happy with the result. It works perfectly and allows me to have dynamic input sizes as well (after fixing several other issues with Reshape, Resize, and ConvTranspose).
For the parameters:

  • ksizes/sizes: Works, though it might not be a good fit with with large values, such as 32x32. I don't know if such usage is common.
  • stride: Should map perfectly to the stride in the Conv op I believe
  • rate: This adds a stride to the ksize x ksize inputs. I don't know if this is what dilations is for Conv, but otherwise I'm not sure what could be done for this. ("dilation value along each spatial axis of the filter" doesn't really explain anything.)
  • padding: VALID appears to match Conv's behavior with ksize and stride so nothing needs to be done here. SAME adds zero padding in a way that is dependent on the input size and can easily be done with just adding padding of the appropriate size. If we are lucky the auto_pad attribute of Conv behaves the same, but otherwise it just means calculating the appropriate padding for that specific input size.

Assuming we have a way to convert ExtractImagePatches to either Conv or SpaceToDepth in either all cases or in specific cases, what can/needs to be done to move this forward and what can I help with? Could this be done by the converter (the Conv approach technically is a change of semantics even though the behavior is the same), or could we make a guide on what options you have to deal with these models instead (i.e. custom OP implementation, replacing it with common OPs, etc.). Is there still a need for pushing ExtractImagePatches to get standalized assuming we have a workaround with standard operators?

@TomWildenhain-Microsoft
Copy link
Contributor

@spillerrec it would be fantastic if you could add support for this op in tf2onnx. Add a handler for ExtractImagePatches in nn.py https://github.com/onnx/tensorflow-onnx/blob/master/tf2onnx/onnx_opset/nn.py You can copy CTCGreedyDecoder as a template. Use the lowest onnx opset you can get away with https://github.com/onnx/onnx/blob/master/docs/Operators.md. Then add some unit tests to test_backend.py https://github.com/onnx/tensorflow-onnx/blob/master/tests/test_backend.py

It is perfectly fine to only be able to convert certain cases. Use utils.make_sure(<condition>, "explanation") to require a certain condition for conversion.

It's fine that this conversion uses a composition of ops and might have different performance characteristics from the original TF op. That's pretty common in the converter. Since this is a pre-processing op, hopefully the inference time won't be dominated by it. For large sizes hopefully we can use SpaceToDepth instead, since I think those cases have strides matching the size (otherwise the output would be huge anyway).

As you are working, open a PR and I'll be happy to comment if you run into any issues.

@spillerrec
Copy link

@TomWildenhain-Microsoft Thanks for the pointers, I will give it a go.

@TomWildenhain-Microsoft
Copy link
Contributor

@spillerrec Just checking if you've had a chance to work on this. Happy to help if you are stuck.

@spillerrec
Copy link

@TomWildenhain-Microsoft I'm sorry about the lack of updates, no, I have not looked at it yet. However I have next week off, so I will not have an excuse now for not getting some progress done on it. ; )

@savvaki
Copy link

savvaki commented Dec 9, 2021

@spillerrec Are there any updates on this issue?

@spillerrec
Copy link

@savvaki No, not anything else than I looked into the dilations parameter and that it should be possible to use that to implement rate.

To sum up, the current status is that we have a decent workaround using standard operators and most people would probably be happy with it. It is currently stalled on me to actually add it.

I honestly forgot about this issue, so thanks for the ping. I found out that mingw can't be used with onnxruntime on Windows (and I have no intention to use Visual Studio), so this stalled my own ONNX related project (and will probably stay so until I switch back to Linux). I'm still willing to look at it, just don't assume it will be anytime soon with my track record and certainly not before January. If anyone else is interested in getting this done, don't hold back because of me.

@savvaki
Copy link

savvaki commented Dec 13, 2021

@spillerrec Do you perhaps have the tf code lying around that you used to generate the onnx models that you posted above?

@savvaki
Copy link

savvaki commented Dec 14, 2021

I created an "equivalent" operation to tf.image.extract_patches using an identity kernel with a convolution. The code seemingly works for batches, as well as different patch sizes. In addition, multiple input feature maps seem to work. This is an code snippet if anyone is interested. I have not tried different padding variations/strides, meaning that this snippet is not complete for more exotic use cases. Feel free to modify.

import tensorflow as tf
import numpy as np

# Define some parameters
input_shape = [5, 10, 10, 6]
patch_size = 3 # height/width of the square image patch size
padding = 'SAME'

# Generate example input
x_original = tf.cast(tf.reshape(tf.range(np.prod(input_shape)), input_shape), dtype=tf.float32)

x = tf.transpose(x_original, perm=[0, 3, 1, 2]) # Move feature map dimension next to the batch dimension
x = tf.expand_dims(x, -1) # Add extra channel at the end 

# Create an identity kernel
kernel = tf.reshape(tf.eye(patch_size**2), [patch_size, patch_size, 1, patch_size**2]) # [filter_height, filter_width, in_channels, out_channels]

# Convolve with identity kernel
patches_simulation = tf.nn.conv2d(x, kernel, strides=[1, 1, 1, 1], padding='VALID')
patches_simulation = tf.transpose(patches_simulation, perm=[0 ,2 ,3, 4, 1]) # Move filter dim to last
patches_simulation_shape = tf.shape(patches_simulation)
patches_simulation = tf.reshape(patches_simulation, [patches_simulation_shape[0], patches_simulation_shape[1], patches_simulation_shape[2], -1]) # Merge last two dims into one

# Intended output to compare against
patches = tf.image.extract_patches(x_original, sizes=[1, patch_size, patch_size, 1], strides=[1, 1, 1, 1], rates=[1, 1, 1, 1], padding='VALID')

# Print out check to see if the alternative method is identical to tf.image.extract_patches
print(f'tf.image.extract_patches shape {patches.shape} simulation shape {patches_simulation.shape} same shape: {patches.shape == patches_simulation.shape}')
print(f'Simulation is correct? {tf.reduce_all(tf.math.equal(patches, patches_simulation)).numpy()}')

@ManishwarG
Copy link

I am trying to convert a custom developed model with a custom convolution layer and I implemented it with tf.image.extract_patches and running in to this error while Coverting to tflite model.
"RuntimeError: Failed to initialize op resolver for calibration:
There are unresolved custom ops: [ExtractImagePatches]Encountered unresolved custom op: ExtractImagePatches."
I am using tensorflow=2.8
I know that tflite doesn't support tf.image.extract_patches, is there any way to get around it? or custom function for tf.image.extract_patches?

@kobiche
Copy link

kobiche commented Mar 21, 2023

Hi! Any updates? I'm also waiting for this update :(

[UPDATE]
I found this implementation that works for me:
https://github.com/PINTO0309/openvino2tensorflow/blob/63fff406a2bd44258c91521add6efc8acf899808/openvino2tensorflow/openvino2tensorflow.py#L6821

@nanoskript nanoskript linked a pull request Jun 16, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution welcome Community contribution is welcomed unsupported ops Issues related to unsupported operators
Projects
None yet
Development

Successfully merging a pull request may close this issue.