gh-84016: Implement missing `PyObject_CopyToObject` in `PEP3118` #124747

rruuaanng · 2024-09-29T06:48:38Z

Issue: Implement PyObject_CopyToObject #84016

https://peps.python.org/pep-3118/
PEP3118 specifies the following API requirements

Py_ssize_t PyBuffer_SizeFromFormat(const char *)

PyObject * PyMemoryView_GetContiguous(PyObject *obj,  int buffertype,
                                      char fortran)

int PyObject_CopyToObject(PyObject *obj, void *buf, Py_ssize_t len,
                          char fortran)

int PyObject_CopyData(PyObject *dest, PyObject *src)

int PyBuffer_IsContiguous(Py_buffer *view, char fortran)

void PyBuffer_FillContiguousStrides(int ndim, Py_ssize_t *shape,
                                    Py_ssize_t *strides, Py_ssize_t itemsize,
                                    char fortran)

int PyBuffer_FillInfo(Py_buffer *view, void *buf,
                      Py_ssize_t len, int readonly, int infoflags)

However, in the existing implementation, there is no PyObject_CopyToObject. So, the significance of this PR is this.

picnixz

Tests are required for this functionality. You can have a look at how tests are implemented in the Modules/_testcapi folder.

Objects/abstract.c

Doc/c-api/buffer.rst

Misc/NEWS.d/next/C_API/2024-09-29-14-34-53.gh-issue-84016.f1ThMN.rst

Objects/abstract.c

picnixz · 2024-09-29T13:22:29Z

Please @rruuaanng, before marking a review comment as resolved, either add an emote to indicate whether you made the changes or not, or if you did not make the changes, write a comment to explain why and let the reviewer close the discussion once they think it's resolved.

rruuaanng · 2024-09-29T13:33:59Z

Please @rruuaanng, before marking a review comment as resolved, either add an emote to indicate whether you made the changes or not, or if you did not make the changes, write a comment to explain why and let the reviewer close the discussion once they think it's resolved.

Oh sorry, the incidents I marked resolved have been submitted, but there is no reply.

rruuaanng · 2024-09-29T15:20:16Z

As I commented on the issue, I think we need to get some more evidence that this is actually needed.

I propose an application scenario. If you need to copy the data of a C array to an object with high performance, this seems to be the most direct way, and it also meets the operation interface of PEP3118.

Edit: The second reason is that PEP3118 does not implement this interface, making the standard incomplete. In fact, this makes the coverage of the standard smaller.

…h84016

encukou · 2024-10-01T09:39:17Z

PEP-3118 is an enhancement proposal, not a standard. If this wasn't done together with the original change in 3.0, it should be treated as a new change.

rruuaanng · 2024-10-01T09:42:22Z

PEP-3118 is an enhancement proposal, not a standard. If this wasn't done together with the original change in 3.0, it should be treated as a new change.

Yes, I think you are right, but this commit I think is very meaningful, because it makes it possible to quickly copy C memory into objects. And I added corresponding documentation, stable instructions and corresponding tests for this (currently being revised).

picnixz

General remarks:

Several core developers have expressed concern about the usefulness of this API. Even if it is part of a PEP, it would be good to know why it was not implemented with the rest of the API.
Is it actually possible to simply use PyBuffer_FromContiguous instead? namely doing everything using a Py_buffer object and then using it? If so, I don't think it's worth the addition of that API.
Tests are insufficient IMO. They should cover more cases, e.g., when the buffer is not filled until the end.

Following those observations, I'd like not to be requested for a review until we are sure to add this interface.

picnixz · 2024-10-02T17:57:43Z

Objects/abstract.c

+    /* just copy it directly through memcpy */
+    if (fort == 'C' || fort == 'F') {
+        memcpy(view_obj.buf, buf, len);
+        PyBuffer_Release(&view_obj);
+        return 0;
+    }


This case is still incorrect; memcpy should only be used if the destination is one-dimensional.

When the target is multi-dimensional, their memory is still continuous. Refer to PyObject_CopyData for details.

I'm sorry but maybe there is some misunderstanding. I suggest reading this post to understand the difference between a C and a F-style ordering (namely row vs column-major order): https://manik.cc/2021/02/25/memory-order.html.

Note that PyObject_CopyData only uses memcpy when the memory is contiguous, and this is checked via PyBuffer_IsContiguous.

rruuaanng · 2024-10-03T00:26:42Z

General remarks:

* Several core developers have expressed concern about the usefulness of this API. Even if it is part of a PEP, it would be good to know why it was not implemented with the rest of the API.

* Is it actually possible to simply use `PyBuffer_FromContiguous` instead? namely doing everything using a `Py_buffer` object and then using it? If so, I don't think it's worth the addition of that API.

* Tests are insufficient IMO. They should cover more cases, e.g., when the buffer is not filled until the end.

Following those observations, I'd like not to be requested for a review until we are sure to add this interface.

I agree with picnixz, because I noticed that there is already a PyBuffer_FromContiguous. So I think my PR may be able to change from a new function to an alias of PyBuffer_FromContiguous, so as to fully implement PEP3118 and reuse its functionality.

Edit: My reasoning for this is that PyBuffer_FromContiguous is a weird name for PyObject_CopyToObject, which is not that intuitive (it’s important that users can directly know from the pep3118 homepage that there is such an API).

Like this

int
PyObject_CopyToObject(PyObject *obj, const void *buf, Py_ssize_t len, char fort)
{
    Py_buffer view_obj;

    if (PyObject_GetBuffer(obj, &view_obj, PyBUF_FULL)) {
        return -1;
    }
    return PyBuffer_FromContiguous(&view_obj, buf, len, fort);
}

Edit: And I personally think that although it is difficult to find the implementation author at present, it seems that the name of PyObject_CopyToObject in the existing implementation is PyBuffer_FromContiguous. I think it may be possible to change this name or use the alias I suggested above.

rruuaanng · 2024-10-03T00:41:47Z

cc @vstinner

vstinner · 2024-10-03T10:20:19Z

@JelleZijlstra:

As I commented on the issue, I think we need to get some more evidence that this is actually needed.

I concur with Jelle. I'm not convinced that we need this feature.

vstinner · 2024-10-03T10:13:46Z

Doc/c-api/buffer.rst

+
+   Copy data from *buf* to *obj* buffer. Can convert between C-style and
+   or Fortran-style buffers (C-style if *fort* is ``'C'`` or Fortran-style
+   if *fort* is ``'F'``, ``'A'`` or other for defalut quick copy).


Suggested change

if *fort* is ``'F'``, ``'A'`` or other for defalut quick copy).

if *fort* is ``'F'``, ``'A'`` or other for default quick copy).

vstinner · 2024-10-03T10:14:08Z

Doc/c-api/buffer.rst

+   or Fortran-style buffers (C-style if *fort* is ``'C'`` or Fortran-style
+   if *fort* is ``'F'``, ``'A'`` or other for defalut quick copy).
+
+   ``0`` is returned on success, ``-1`` on error.


Suggested change

``0`` is returned on success, ``-1`` on error.

Return ``0`` on success, or return ``-1`` on error.

vstinner · 2024-10-03T10:14:53Z

Lib/test/test_capi/test_abstract.py

+        s1 = copy_to_object(bytearray(3), 'abc', b'C')
+        s2 = copy_to_object(bytearray(3), 'abc', b'F')
+        s3 = copy_to_object(bytearray(3), 'abc', b'A')
+        self.assertEqual(s1, s2)


I would prefer to compare it to a known bytearray, such as: bytearray(b'abc').

vstinner · 2024-10-03T10:16:18Z

Modules/_testcapi/abstract.c

+    int result;
+    char *buf, fort;
+
+    if (!PyArg_ParseTuple(args, "Os#c", &obj, &buf, &len, &fort)) {


You should not expect Unicode, but raw bytes:

Suggested change

if (!PyArg_ParseTuple(args, "Os#c", &obj, &buf, &len, &fort)) {

if (!PyArg_ParseTuple(args, "Oy#c", &obj, &buf, &len, &fort)) {

vstinner · 2024-10-03T10:18:37Z

Misc/stable_abi.toml

@@ -2283,6 +2283,8 @@
    added = '3.11'
 [function.PyObject_CopyData]
    added = '3.11'
+[function.PyObject_CopyToObject]


Don't add this function directly to the stable ABI: remove this entry.

vstinner · 2024-10-03T10:19:08Z

Lib/test/test_capi/test_abstract.py

+        self.assertRaises(BufferError, copy_to_object, bytearray(2), 'abc', b'A')
+        self.assertRaises(TypeError, copy_to_object, list(), 'abc', b'C')
+        self.assertRaises(TypeError, copy_to_object, list(), 'abc', b'F')
+        self.assertRaises(TypeError, copy_to_object, list(), 'abc', b'A')


You should test read-only buffer.

vstinner · 2024-10-03T10:21:54Z

@rruuaanng: I suggest you to put the implementation on hold until we decide if this function is really needed. Instead, you can answer to @JelleZijlstra questions. See also his question on the issue.

rruuaanng · 2024-10-03T10:26:28Z

I have already given the answer to this question above. And wait for the results of the core discussion.

vstinner · 2024-10-03T10:31:15Z

@rruuaanng:

I have already given the answer to this question above. And wait for the results of the core discussion.

I'm thinking at this question of @JelleZijlstra:

If you'd like this function implemented, could you provide examples of existing code (e.g., in existing C extensions) that would be made better by this function?

I didn't see any convincing example so far.

rruuaanng · 2024-10-03T10:41:50Z

@rruuaanng:

I have already given the answer to this question above. And wait for the results of the core discussion.

I'm thinking at this question of @JelleZijlstra:

If you'd like this function implemented, could you provide examples of existing code (e.g., in existing C extensions) that would be made better by this function?

I didn't see any convincing example so far.

It seems that I didn’t see this issue. However, that’s fine. My response to the issue is that if the user needs to directly write to an already encapsulated PyObject, rather than obtaining a buffer through PyObject_GetBuffer and performing other operations, they can directly choose the API implemented in this PR. It provides an intuitive and fast way to write the desired data into a PyObject.
(I would like to reiterate that this PR is a supplement to the missing API in PEP 3118. Although there is a function PyBuffer_FromContiguous for operating on Py_buffer, it still requires using PyObject_GetBuffer rather than the quick and direct operation I mentioned earlier).

I'm actually planning to modify my PR to make it an alias, which would fully implement PEP 3118 while reusing existing functionality to make the API more expressive.

int
PyObject_CopyToObject(PyObject *obj, const void *buf, Py_ssize_t len, char fort)
{
    Py_buffer view_obj;

    if (PyObject_GetBuffer(obj, &view_obj, PyBUF_FULL)) {
        return -1;
    }
    return PyBuffer_FromContiguous(&view_obj, buf, len, fort);
}

Edit: I think this makes sense for users, because intuitively, most programmers will choose XxxToXxx for writing xxx to xxx, rather than implementing a combination of multiple operations.

Edit：Please give me some time if you can. I'm looking for a stronger proof.

rruuaanng · 2024-10-04T09:04:01Z

In pygl/src/buffers/buffer.c

static PyObject *buf_read(PyBuffer *self, PyObject *args, PyObject *kwargs)
{
    static char *kwNames[] = {"out", "size", "offset", NULL};

    PyObject *result = NULL;

    Py_buffer outBuffer = {0};
    GLsizeiptr size = 0;
    GLintptr offset = 0;
    if (!PyArg_ParseTupleAndKeywords(
            args, kwargs, "w*L|L", kwNames,
            &outBuffer, &size, &offset))
        goto end;

    if (!utils_check_buffer_contiguous(&outBuffer))
        goto end;

    THROW_IF_GOTO(
        size + offset > self->size,
        PyExc_ValueError,
        "Requested size and offset exceed buffer size.",
        end);

    if (size > outBuffer.len)
    {
        PyErr_Format(PyExc_ValueError, "Buffer that was passed in is smaller than requested size. (buffer size: %zu, requested: %zu)", outBuffer.len, size);
        goto end;
    }

    if (self->flags & GL_DYNAMIC_STORAGE_BIT)
        glGetNamedBufferSubData(self->id, offset, size, outBuffer.buf);
    else
    {
        THROW_IF_GOTO(
            !(self->flags & GL_MAP_READ_BIT),
            PyExc_RuntimeError,
            "Mappable buffer has to have flag MAP_READ_BIT set to allow reading.",
            end);

        THROW_IF_GOTO(
            self->dataPtr == NULL,
            PyExc_RuntimeError,
            "Mappable buffer has to be mapped before attempting to read data. Map buffer or use MAP_PERSISTENT_BIT.",
            end);

        int result = PyBuffer_FromContiguous(&outBuffer, (char *)self->dataPtr + offset, size, 'C');
        THROW_IF_GOTO(
            result == -1,
            PyExc_RuntimeError,
            "Couldn't read data from buffer.",
            end);
    }

    result = Py_NewRef(Py_None);

end:
    if (outBuffer.buf != NULL)
        PyBuffer_Release(&outBuffer);

    return result;
}

The following it's my change

static PyObject *buf_read(PyBuffer *self, PyObject *args, PyObject *kwargs)
{
    ...
    if (!PyArg_ParseTupleAndKeywords(
            args, kwargs, "OL|L", kwNames,
            &obj, &size, &offset))
        goto end;
    
    ...

    PyObject_GetBuffer(&obj, &view_obj, PyBUF_FULL);
    if (self->flags & GL_DYNAMIC_STORAGE_BIT) {
        glGetNamedBufferSubData(self->id, offset, size, view_obj.buf);
    }
    PyObject_CopyToObject(obj, (char *)self->dataPtr + offset, size, 'C');
    ...
}

Hmmm, this example doesn't seem to be the most appropriate, but I did look into buffer operations in libraries such as NumPy and SciPy. It seems that they use their own implemented APIs instead of the API provided by CPython. For the modified function I provided (although the code is really not good, please ignore the other parts I did not change and only focus on the function return parameters), I believe this simplifies a lot of redundant checks (such as continuity and buffer size). I want to emphasize that users do not need to write multiple function combinations; they can just use one function. In addition, the buffer it returns is PyObject instead of Py_buffer, which makes the code more general (it seems a bit far-fetched, but I'm sorry, I really can't find other examples)

erlend-aasland · 2024-10-15T08:18:33Z

@rruuaanng, instead of prematurely opening a PR, wait for consensus on Discourse or on the issue before proposing a change. I'm closing this as of the comments from several core devs and triagers. Feel free to re-open this PR on your on your own fork. You can reopen it here if there is consensus on implementing this feature.

Thanks!

Added public PyObject_CopyToObject() to pybuffer.h

5124891

bedevere-app bot added the awaiting review label Sep 29, 2024

rruuaanng marked this pull request as draft September 29, 2024 06:48

bedevere-app bot removed the awaiting review label Sep 29, 2024

rruuaanng changed the title ~~Implement PyObject_CopyToObject in PEP3118~~ gh-84016: Implement PyObject_CopyToObject in PEP3118 Sep 29, 2024

bedevere-app bot mentioned this pull request Sep 29, 2024

Implement PyObject_CopyToObject #84016

Closed

rruuaanng changed the title ~~gh-84016: Implement PyObject_CopyToObject in PEP3118~~ gh-84016: Implement PyObject_CopyToObject in **PEP3118** Sep 29, 2024

rruuaanng changed the title gh-84016: Implement PyObject_CopyToObject in **PEP3118** gh-84016: Implement PyObject_CopyToObject in PEP3118 Sep 29, 2024

rruuaanng added 3 commits September 29, 2024 15:03

int \n change to int

686673e

Change error string

2eff13d

fort rename to fortran

4569791

rruuaanng changed the title ~~gh-84016: Implement PyObject_CopyToObject in PEP3118~~ gh-84016: Implement the missing PyObject_CopyToObject in PEP3118 Sep 29, 2024

rruuaanng changed the title ~~gh-84016: Implement the missing PyObject_CopyToObject in PEP3118~~ gh-84016: Implement missing PyObject_CopyToObject in PEP3118 Sep 29, 2024

rruuaanng added 4 commits September 29, 2024 15:30

Add document

a91c5ea

Remove useless code

df16015

Add \n

50bfb42

Remove space

6498381

rruuaanng marked this pull request as ready for review September 29, 2024 09:35

bedevere-app bot added the awaiting review label Sep 29, 2024

Change style

3c0436e

rruuaanng requested review from a team and encukou as code owners September 29, 2024 10:44

rruuaanng force-pushed the gh84016 branch from 302420c to 3c0436e Compare September 29, 2024 10:50

picnixz reviewed Sep 29, 2024

View reviewed changes

rruuaanng added 2 commits September 29, 2024 21:07

Change NEWS

0e7de15

Change review

1d797ef

Change error string

3388eff

Change doc

6e9bfd8

rruuaanng added 3 commits September 29, 2024 23:52

Re implement PyObject_CopyToObject

e83402a

Change doc

981b6c5

Regen stable_abi

7bc039e

rruuaanng marked this pull request as draft October 1, 2024 03:53

bedevere-app bot removed the awaiting review label Oct 1, 2024

rruuaanng added 2 commits October 1, 2024 13:30

Change PyObject_CopyToObject

edd093d

Merge branch 'gh84016' of https://github.com/rruuaanng/cpython into g…

66101e0

…h84016

Add test

611ad22

rruuaanng force-pushed the gh84016 branch from 4632163 to 611ad22 Compare October 1, 2024 10:28

rruuaanng requested a review from picnixz October 1, 2024 12:58

rruuaanng added 2 commits October 2, 2024 09:43

Change to bytearray

eb1a891

Change test

1153275

rruuaanng marked this pull request as ready for review October 2, 2024 05:56

bedevere-app bot added the awaiting review label Oct 2, 2024

picnixz reviewed Oct 2, 2024

View reviewed changes

vstinner reviewed Oct 3, 2024

View reviewed changes

erlend-aasland closed this Oct 15, 2024

rruuaanng deleted the gh84016 branch October 15, 2024 11:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gh-84016: Implement missing `PyObject_CopyToObject` in `PEP3118` #124747

gh-84016: Implement missing `PyObject_CopyToObject` in `PEP3118` #124747

rruuaanng commented Sep 29, 2024 •

edited

Loading

picnixz left a comment

picnixz commented Sep 29, 2024 •

edited

Loading

rruuaanng commented Sep 29, 2024

rruuaanng commented Sep 29, 2024 •

edited

Loading

encukou commented Oct 1, 2024

rruuaanng commented Oct 1, 2024

picnixz left a comment

picnixz Oct 2, 2024

rruuaanng Oct 2, 2024

picnixz Oct 3, 2024

rruuaanng commented Oct 3, 2024 •

edited

Loading

rruuaanng commented Oct 3, 2024

vstinner commented Oct 3, 2024

vstinner Oct 3, 2024

vstinner Oct 3, 2024

vstinner Oct 3, 2024

vstinner Oct 3, 2024

vstinner Oct 3, 2024

vstinner Oct 3, 2024

vstinner commented Oct 3, 2024

rruuaanng commented Oct 3, 2024

vstinner commented Oct 3, 2024 •

edited

Loading

rruuaanng commented Oct 3, 2024 •

edited

Loading

rruuaanng commented Oct 4, 2024 •

edited

Loading

erlend-aasland commented Oct 15, 2024

	if fort is ``'F'``, ``'A'`` or other for defalut quick copy).
	if fort is ``'F'``, ``'A'`` or other for default quick copy).

	``0`` is returned on success, ``-1`` on error.
	Return ``0`` on success, or return ``-1`` on error.

	if (!PyArg_ParseTuple(args, "Os#c", &obj, &buf, &len, &fort)) {
	if (!PyArg_ParseTuple(args, "Oy#c", &obj, &buf, &len, &fort)) {

gh-84016: Implement missing PyObject_CopyToObject in PEP3118 #124747

gh-84016: Implement missing PyObject_CopyToObject in PEP3118 #124747

Conversation

rruuaanng commented Sep 29, 2024 • edited Loading

picnixz left a comment

Choose a reason for hiding this comment

picnixz commented Sep 29, 2024 • edited Loading

rruuaanng commented Sep 29, 2024

rruuaanng commented Sep 29, 2024 • edited Loading

encukou commented Oct 1, 2024

rruuaanng commented Oct 1, 2024

picnixz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rruuaanng commented Oct 3, 2024 • edited Loading

rruuaanng commented Oct 3, 2024

vstinner commented Oct 3, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vstinner commented Oct 3, 2024

rruuaanng commented Oct 3, 2024

vstinner commented Oct 3, 2024 • edited Loading

rruuaanng commented Oct 3, 2024 • edited Loading

rruuaanng commented Oct 4, 2024 • edited Loading

erlend-aasland commented Oct 15, 2024

gh-84016: Implement missing `PyObject_CopyToObject` in `PEP3118` #124747

gh-84016: Implement missing `PyObject_CopyToObject` in `PEP3118` #124747

rruuaanng commented Sep 29, 2024 •

edited

Loading

picnixz commented Sep 29, 2024 •

edited

Loading

rruuaanng commented Sep 29, 2024 •

edited

Loading

rruuaanng commented Oct 3, 2024 •

edited

Loading

vstinner commented Oct 3, 2024 •

edited

Loading

rruuaanng commented Oct 3, 2024 •

edited

Loading

rruuaanng commented Oct 4, 2024 •

edited

Loading