Why does Python copy NumPy arrays when the lengths of the dimensions are the same?













I have a problem with references to NumPy arrays.
I have a list of arrays of the form



import numpy as np
a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
np.array([0.0, 0.2, 0.4, 0.6, 0.8])]


and if I now create a new variable



b = np.array(a)


and do



b[0] += 1
print(a)


then a is not changing.



a = [array([0. , 0.2, 0.4, 0.6, 0.8]), 
array([0. , 0.2, 0.4, 0.6, 0.8]),
array([0. , 0.2, 0.4, 0.6, 0.8])]


But if I do the same thing with:



a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
np.array([0.0, 0.2, 0.4, 0.6])]


That is, I removed one number from the end of the last array. Then I run the same code again:



b = np.array(a)
b[0] += 1
print(a)


Now a does change, which is what I thought was the normal behavior in Python.



a = [array([1. , 1.2, 1.4, 1.6, 1.8]), 
array([0. , 0.2, 0.4, 0.6, 0.8]),
array([0. , 0.2, 0.4, 0.6])]


Can anybody explain this to me?

























  • 5





    This is one of the reasons trying to make jagged arrays or arrays of arrays in NumPy is a really bad idea.

    – user2357112
    52 mins ago
















python python-3.x numpy





asked 55 mins ago









sholli





















5 Answers
































In the first case, NumPy sees that the input to numpy.array can be interpreted as a 3x5, 2-dimensional array-like, so it does that. The result is a new array of float64 dtype, with the input data copied into it, independent of the input object. b[0] is a view of the new array's first row, completely independent of a[0], and modifying b[0] does not affect a[0].



In the second case, since the lengths of the subarrays are unequal, the input cannot be interpreted as a 2-dimensional array-like. However, considering the subarrays as opaque objects, the list can be interpreted as a 1-dimensional array-like of objects, which is the interpretation NumPy falls back on. The result of the numpy.array call is a 1-dimensional array of object dtype, containing references to the array objects that were elements of the input list. b[0] is the same array object that a[0] is, and b[0] += 1 mutates that object.
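The two interpretations can be told apart by inspecting the dtype and shape of the result. The following is a minimal sketch, not taken from the question: the names `equal`, `ragged`, `b_equal` and `b_ragged` are illustrative. It also assumes a recent NumPy, where (since version 1.24) ragged input raises a ValueError unless `dtype=object` is passed explicitly; the older version used in the question fell back to object dtype on its own.

```python
import numpy as np

# Case 1: three arrays of equal length are copied into a new 3x5 float64 array.
equal = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]
b_equal = np.array(equal)
b_equal[0] += 1          # modifies only the new array's own buffer

# Case 2: unequal lengths give a 1-D object array that holds references.
# dtype=object is explicit because NumPy >= 1.24 refuses to guess for ragged input.
ragged = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
          np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
          np.array([0.0, 0.2, 0.4, 0.6])]
b_ragged = np.array(ragged, dtype=object)
b_ragged[0] += 1         # mutates the array object shared with ragged[0]

print(b_equal.dtype, b_equal.shape)
print(b_ragged.dtype, b_ragged.shape)
print(equal[0][0], ragged[0][0])
```

Only `ragged[0]` is affected by the in-place addition; `equal[0]` keeps its original values.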



This length dependence is one of the many reasons that trying to make jagged arrays or arrays of arrays is a really, really bad idea in NumPy. Seriously, don't do it.




















In a nutshell, this is a consequence of your data. The copy does not happen in the second case because your arrays are not equally sized.



With equally sized sub-arrays, the elements can be packed into a memory-efficient layout in which any N-D array is represented by a compact 1-D buffer in memory. NumPy then handles the translation of multi-dimensional indexes to 1-D indexes internally. For example, index [i, j] of a 2-D array with N columns maps to i*N + j (when stored in row-major order). The data from the original list of arrays is copied into this compact 1-D buffer, so modifications made to the new array do not affect the original.
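The row-major mapping described above can be checked directly. This is a small sketch with made-up data (a 3x5 array built with `np.arange`), not the arrays from the question; `n_cols` plays the role of N.

```python
import numpy as np

b = np.arange(15, dtype=np.float64).reshape(3, 5)   # 3 rows, N = 5 columns
n_cols = b.shape[1]

i, j = 2, 3
flat = b.ravel()                 # 1-D view of the same contiguous buffer
flat_index = i * n_cols + j      # row-major mapping of [i, j]

print(b[i, j], flat[flat_index])
# The strides express the same mapping in bytes: (N * itemsize, itemsize).
print(b.strides, b.itemsize)
```

Both lookups read the same memory location, which is why a rectangular array needs only one buffer plus shape and stride information.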



With ragged lists/arrays, this cannot be done. The array is effectively a Python list, where each element is a Python object. For efficiency, only the object references are copied, not the data. This is why you can mutate the original list elements in the second case but not the first.




















When you call np.array on a list of arrays with consistent lengths, a new np.ndarray of floats is created.



Thus, a[0] and b[0] do not refer to the same object.



a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8])]
b = np.array(a)
id(a[0])
# 139663994327728
id(b[0])
# 139663994324672


However, with arrays of varying lengths, np.array creates an np.ndarray whose elements are objects.



a2 = [np.array([0. , 0.2, 0.4, 0.6, 0.8]),
      np.array([0. , 0.2, 0.4, 0.6, 0.8]),
      np.array([0. , 0.2, 0.4, 0.6])]
b2 = np.array(a2)
b2
array([array([1. , 1.2, 1.4, 1.6, 1.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
       array([0. , 0.2, 0.4, 0.6])], dtype=object)


Here b2 still holds the same references as a2:



for s in a2:
    print(id(s))
# 139663994330128
# 139663994328448
# 139663994329488

for s in b2:
    print(id(s))
# 139663994330128
# 139663994328448
# 139663994329488


As a result, adding to b2[0] also adds to a2[0].
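As a complement to comparing id() values, the same check can be written with `is` and `np.shares_memory`, which avoids reading ids of temporary view objects such as the one b[0] creates on each access. This is a sketch under the assumption of a recent NumPy, so `dtype=object` is passed explicitly for the ragged case; the variable names `copied` and `same_objects` are illustrative.

```python
import numpy as np

a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]
b = np.array(a)                      # 2-D float array, data copied

a2 = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
      np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
      np.array([0.0, 0.2, 0.4, 0.6])]
b2 = np.array(a2, dtype=object)      # explicit dtype: required on NumPy >= 1.24

# Does b reuse the buffer of a[0]? It should not, since the data was copied.
copied = np.shares_memory(a[0], b)
# Are the elements of b2 the very same objects as the elements of a2?
same_objects = [x is y for x, y in zip(a2, b2)]

print(copied)
print(same_objects)
```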




















        In [1]: a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]), 
        ...: np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
        ...: np.array([0.0, 0.2, 0.4, 0.6, 0.8])]
        In [2]:
        In [2]: a
        Out[2]:
        [array([0. , 0.2, 0.4, 0.6, 0.8]),
        array([0. , 0.2, 0.4, 0.6, 0.8]),
        array([0. , 0.2, 0.4, 0.6, 0.8])]


        a is a list of arrays. b is a 2d array.



        In [3]: b = np.array(a)                                                         
        In [4]: b
        Out[4]:
        array([[0. , 0.2, 0.4, 0.6, 0.8],
        [0. , 0.2, 0.4, 0.6, 0.8],
        [0. , 0.2, 0.4, 0.6, 0.8]])
        In [5]: b[0] += 1
        In [6]: b
        Out[6]:
        array([[1. , 1.2, 1.4, 1.6, 1.8],
        [0. , 0.2, 0.4, 0.6, 0.8],
        [0. , 0.2, 0.4, 0.6, 0.8]])


        b gets values from a but does not contain any of the a objects. The underlying data structure of this b is very different from a, the list. If that isn't clear, you may want to review the numpy basics (which talk about shape, strides, and data buffers).



        In the second case, b is an object array, containing the same objects as a:



        In [8]: b = np.array(a)                                                         
        In [9]: b
        Out[9]:
        array([array([0. , 0.2, 0.4, 0.6, 0.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
        array([0. , 0.2, 0.4, 0.6])], dtype=object)


This b behaves a lot like a: both contain arrays.



        The construction of this object array is quite different from the 2d numeric array. I think of the numeric array as the default, or normal, numpy behavior, while the object array is a 'concession', giving us a useful tool, but one which does not have the calculation power of the multidimensional array.



It is easy to make an object array by mistake - some say too easy. It can be harder to make one reliably by design. For example, with the original a, we have to do:



        In [17]: b = np.empty(3, object)                                                
        In [18]: b[:] = a[:]
        In [19]: b
        Out[19]:
        array([array([0. , 0.2, 0.4, 0.6, 0.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
        array([0. , 0.2, 0.4, 0.6, 0.8])], dtype=object)


        or even for i in range(3): b[i] = a[i]
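The np.empty-then-assign pattern above can be wrapped in a small helper so that the result is a 1-D object array regardless of whether the sub-arrays happen to have equal lengths. The function name `as_object_array` is hypothetical, not a NumPy API; this is only a sketch of the approach.

```python
import numpy as np

def as_object_array(arrays):
    """Return a 1-D object array whose elements are the given arrays (not copies)."""
    out = np.empty(len(arrays), dtype=object)
    for k, arr in enumerate(arrays):
        out[k] = arr             # stores a reference to arr, no data is copied
    return out

# Equal-length input, which np.array(a) would otherwise turn into a 3x5 float array.
a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]
b = as_object_array(a)

print(b.shape, b.dtype)
print(b[0] is a[0])
```

Because the elements are references, in-place changes to b[0] remain visible through a[0], exactly as in the ragged case.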




















The primary use case for which numpy.array() was designed is to create an n-dimensional array of numbers in its own efficient internal structure.



          Whenever it is possible to do this, numpy.array() will indeed do it.



When your a is a list of 3 ndarrays, each of size 5, it is clearly possible for numpy.array() to create an n-dimensional ndarray of numbers (specifically a 2-dimensional one, with shape (3, 5)).



          So, any change to b[0] is actually a change to this internal data structure of numbers, which were all copied over from a.



When your a is a list of unequally sized ndarrays, it is no longer possible for numpy.array() to convert it into a rectangular n-dimensional array of numbers.



So, the function does the next best thing it can do, which is to treat each of the 3 ndarrays as an object and return a 1-dimensional ndarray of those objects. The length of this returned ndarray is 3 (the number of objects). You can see this by printing b.shape (will print (3,)) and b.dtype (will print object).



          In this case, numpy.array() does not dive deeper into each of your 3 ndarrays to make copies of those 3 ndarrays, since it is not going to create its own efficiently designed n-dimensional array of numbers -- it is only going to return a 1-dimensional array of objects.
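If independent copies of the ragged sub-arrays are wanted, they have to be made explicitly, since the object array only stores references. One possible sketch, assuming a 1-D object array is still the desired container, copies each sub-array while filling the result:

```python
import numpy as np

a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6])]

# Copy each sub-array explicitly; storing arr itself would only store a reference.
b = np.empty(len(a), dtype=object)
for k, arr in enumerate(a):
    b[k] = arr.copy()

b[0] += 1        # changes the copy only
print(a[0])
print(b[0])
```

With the copies in place, a[0] keeps its original values after the in-place addition on b[0].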






                answered 46 mins ago









user2357112

























                        answered 44 mins ago









coldspeed























                            1














                            When you make a np.array with consistent lengths of lists, a new object np.ndarray of floats is created.



                            Thus, your a[0] and b[0] does not share the same reference.



                            a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
                            np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
                            np.array([0.0, 0.2, 0.4, 0.6, 0.8])]
                            b = np.array(a)
                            id(a[0])
                            # 139663994327728
                            id(b[0])
                            # 139663994324672


                            However, with varying lengths of lists, np.array creates np.ndarray with object as its elements.



                            a2 = [np.array([0. , 0.2, 0.4, 0.6, 0.8]), 
                            np.array([0. , 0.2, 0.4, 0.6, 0.8]),
                            np.array([0. , 0.2, 0.4, 0.6])]
                            b2 = np.array(a2)
                            b2
                            array([array([1. , 1.2, 1.4, 1.6, 1.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
                            array([0. , 0.2, 0.4, 0.6])], dtype=object)


                            Where b2 is still keeping the same references from a2:



                            for s in a2:
                            print(id(s))
                            # 139663994330128
                            # 139663994328448
                            # 139663994329488

                            for s in b2:
                            print(id(s))
                            # 139663994330128
                            # 139663994328448
                            # 139663994329488


                            Which makes addition to b2[0] results in addition to a2[0].






                            share|improve this answer




























                              1














                              When you make a np.array with consistent lengths of lists, a new object np.ndarray of floats is created.



                              Thus, your a[0] and b[0] does not share the same reference.



                              a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
                              np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
                              np.array([0.0, 0.2, 0.4, 0.6, 0.8])]
                              b = np.array(a)
                              id(a[0])
                              # 139663994327728
                              id(b[0])
                              # 139663994324672


                              However, with varying lengths of lists, np.array creates np.ndarray with object as its elements.



                              a2 = [np.array([0. , 0.2, 0.4, 0.6, 0.8]), 
                              np.array([0. , 0.2, 0.4, 0.6, 0.8]),
                              np.array([0. , 0.2, 0.4, 0.6])]
                              b2 = np.array(a2)
                              b2
                              array([array([1. , 1.2, 1.4, 1.6, 1.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
                              array([0. , 0.2, 0.4, 0.6])], dtype=object)


                              Where b2 is still keeping the same references from a2:



                              for s in a2:
                              print(id(s))
                              # 139663994330128
                              # 139663994328448
                              # 139663994329488

                              for s in b2:
                              print(id(s))
                              # 139663994330128
                              # 139663994328448
                              # 139663994329488


                              Which makes addition to b2[0] results in addition to a2[0].






                              share|improve this answer


























                                1












                                1








                                1







                                When you call np.array on a list of equal-length arrays, NumPy creates a new 2-D np.ndarray of floats and copies the values into it.



                                Thus, your a[0] and b[0] do not refer to the same object.



                                a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
                                     np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
                                     np.array([0.0, 0.2, 0.4, 0.6, 0.8])]
                                b = np.array(a)
                                id(a[0])
                                # 139663994327728
                                id(b[0])
                                # 139663994324672


                                However, when the arrays have different lengths, np.array creates a 1-D np.ndarray with dtype=object whose elements are the original array objects.



                                a2 = [np.array([0. , 0.2, 0.4, 0.6, 0.8]),
                                      np.array([0. , 0.2, 0.4, 0.6, 0.8]),
                                      np.array([0. , 0.2, 0.4, 0.6])]
                                b2 = np.array(a2)
                                b2
                                # array([array([0. , 0.2, 0.4, 0.6, 0.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
                                #        array([0. , 0.2, 0.4, 0.6])], dtype=object)


                                Here b2 still holds references to the same array objects as a2:



                                for s in a2:
                                    print(id(s))
                                # 139663994330128
                                # 139663994328448
                                # 139663994329488

                                for s in b2:
                                    print(id(s))
                                # 139663994330128
                                # 139663994328448
                                # 139663994329488


                                As a result, the in-place addition to b2[0] also modifies a2[0].
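
The difference between the two cases can also be checked without comparing id values by eye. The following is a minimal sketch, not part of the original answer; it passes dtype=object explicitly for the ragged list, on the assumption that a recent NumPy (1.24 or later) is in use, where the ragged case raises a ValueError without it.

```python
import numpy as np

# Equal lengths: np.array builds a new 2-D float array and copies the values.
a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]
b = np.array(a)

# Unequal lengths: the result is a 1-D object array of references.
# dtype=object is written out because newer NumPy versions require it here.
a2 = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
      np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
      np.array([0.0, 0.2, 0.4, 0.6])]
b2 = np.array(a2, dtype=object)

same_buffer_regular = np.shares_memory(a[0], b)  # False: b has its own data
same_object_ragged = b2[0] is a2[0]              # True: same Python object

b[0] += 1    # changes only b
b2[0] += 1   # changes the array that a2[0] also refers to

print(a[0][0], a2[0][0])  # 0.0 1.0
```

The `is` test compares object identity directly, which is what the id listings above show indirectly.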






                                answered 42 mins ago









                                Chris





































                                    In [1]: a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
                                       ...:      np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
                                       ...:      np.array([0.0, 0.2, 0.4, 0.6, 0.8])]
                                    In [2]: a
                                    Out[2]:
                                    [array([0. , 0.2, 0.4, 0.6, 0.8]),
                                     array([0. , 0.2, 0.4, 0.6, 0.8]),
                                     array([0. , 0.2, 0.4, 0.6, 0.8])]


                                    a is a list of arrays. b is a 2d array.



                                    In [3]: b = np.array(a)
                                    In [4]: b
                                    Out[4]:
                                    array([[0. , 0.2, 0.4, 0.6, 0.8],
                                           [0. , 0.2, 0.4, 0.6, 0.8],
                                           [0. , 0.2, 0.4, 0.6, 0.8]])
                                    In [5]: b[0] += 1
                                    In [6]: b
                                    Out[6]:
                                    array([[1. , 1.2, 1.4, 1.6, 1.8],
                                           [0. , 0.2, 0.4, 0.6, 0.8],
                                           [0. , 0.2, 0.4, 0.6, 0.8]])


                                    b gets its values from a but does not contain any of the array objects in a. The underlying data structure of this b is very different from that of a, the list. If that isn't clear, you may want to review the numpy basics (which talk about shape, strides, and data buffers).
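
The attributes mentioned here can be inspected directly. This is a small sketch under the assumption of the default float64 dtype and C-contiguous layout (which is what np.array produces for this input); it is meant only to illustrate that the 2d b owns one contiguous buffer rather than holding the arrays from a.

```python
import numpy as np

a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]
b = np.array(a)

print(b.shape)           # (3, 5)
print(b.dtype)           # float64
print(b.strides)         # (40, 8): 5 * 8 bytes per row, 8 bytes per element
print(b.flags.owndata)   # True: b allocated its own data buffer

# b[0] is a view onto b's buffer, not one of the arrays stored in the list a.
row = b[0]
print(row.base is b)     # True

# None of the original arrays shares memory with b.
shares = any(np.shares_memory(b, x) for x in a)
print(shares)            # False
```

Because b[0] is a view into b's own buffer, b[0] += 1 writes into that buffer and leaves the list a untouched.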



                                    In the second case, where the last array in a has only four elements, b is an object array containing the same objects as a:



                                    In [8]: b = np.array(a)
                                    In [9]: b
                                    Out[9]:
                                    array([array([0. , 0.2, 0.4, 0.6, 0.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
                                           array([0. , 0.2, 0.4, 0.6])], dtype=object)


                                    This b behaves a lot like the list a: both hold references to the same arrays.



                                    The construction of this object array is quite different from the 2d numeric array. I think of the numeric array as the default, or normal, numpy behavior, while the object array is a 'concession', giving us a useful tool, but one which does not have the calculation power of the multidimensional array.



                                    It is easy to make an object array by mistake (some say too easy). It can be harder to make one reliably by design. For example, with the original a (three equal-length arrays), we have to do:



                                    In [17]: b = np.empty(3, object)
                                    In [18]: b[:] = a[:]
                                    In [19]: b
                                    Out[19]:
                                    array([array([0. , 0.2, 0.4, 0.6, 0.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
                                           array([0. , 0.2, 0.4, 0.6, 0.8])], dtype=object)


                                    or even assign the elements one at a time: for i in range(3): b[i] = a[i]
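
The element-by-element variant can be wrapped in a small function. The name as_object_array below is hypothetical (it is not a NumPy function), and the sketch uses the explicit loop rather than slice assignment on the assumption that per-element assignment is the form least likely to be reinterpreted as broadcasting across NumPy versions.

```python
import numpy as np

def as_object_array(items):
    """Hypothetical helper: 1-D object array holding references to items."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # stores a reference, does not copy the array
    return out

a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]
b = as_object_array(a)

print(b.shape, b.dtype)  # (3,) object
print(b[0] is a[0])      # True

b[0] += 1                # in-place add on the shared array
print(a[0])              # a[0] has changed as well
```

With this construction the equal-length list behaves like the ragged case from the question: b and a refer to the same arrays.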














                                        edited 38 mins ago

























                                        answered 44 mins ago









                                        hpaulj





































                                            The primary use case for which numpy.array() was designed is to create an n-dimensional array of numbers in its own efficient internal structure.



                                            Whenever that is possible, numpy.array() does it.



                                            When your a is a list of 3 ndarrays, each of size 5, numpy.array() can clearly create an n-dimensional ndarray of numbers (specifically a 2-dimensional one with shape (3, 5)).



                                            So any change to b[0] is a change to this internal data structure of numbers, which were all copied over from a.



                                            When your a is a list of unequally sized ndarrays, numpy.array() can no longer convert it into a rectangular 2-dimensional array of numbers.



                                            So the function does the next best thing: it treats each of the 3 ndarrays as an object and returns a 1-dimensional ndarray of those objects. The length of this returned ndarray is 3 (the number of objects). You can see this by printing b.shape (which prints (3,)) and b.dtype (which prints object).



                                            In this case, numpy.array() does not dive into each of your 3 ndarrays to copy them, since it is not going to build its own efficient n-dimensional array of numbers; it only returns a 1-dimensional array of objects.
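
The shapes and dtypes described above can be printed side by side. This is a sketch rather than the exact session from the question: the names regular and ragged are made up for illustration, and dtype=object is passed explicitly for the ragged list because, as far as I know, NumPy 1.24 and later raise a ValueError for ragged input unless it is given (older versions created the object array implicitly, later with a deprecation warning).

```python
import numpy as np

regular = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]
ragged = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
          np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
          np.array([0.0, 0.2, 0.4, 0.6])]

# Equal sizes: a rectangular 2-dimensional array of numbers.
b_regular = np.array(regular)
print(b_regular.shape, b_regular.dtype)  # (3, 5) float64

# Unequal sizes: a 1-dimensional array of 3 objects.
b_ragged = np.array(ragged, dtype=object)
print(b_ragged.shape, b_ragged.dtype)    # (3,) object

# The elements are the original ndarrays, not copies of them.
not_copied = all(x is y for x, y in zip(b_ragged, ragged))
print(not_copied)                        # True
```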
















                                                answered 13 mins ago









                                                fountainhead





















