ENH: implement voidtype_repr and voidtype_str#8981
Conversation
|
This needs some kind of prefix/suffix to make non-alphabetic hex-strings distinguishable from ints. But hex seems like the right output to me |
|
Maybe like |
|
It would also be nice if the or just taking advantage of the existing bytes->void conversion, at the cost of verbosity (using uppercase for readability) Of course, |
|
Hmm that's pretty good. Let me try that. |
|
While we're at it, |
|
if its hex you should prefix it with |
There was a problem hiding this comment.
as this is private api, does pypy have it?
There was a problem hiding this comment.
Probably not.. I was going to update it once we dicided what format we want.
There was a problem hiding this comment.
Could fall back on the "invoke something in _internal.py" strategy again - repr doesn't need to be incredibly fast
|
@juliantaylor: I'm not a fan of So then we're left comparing:
The first option is surprising, because passing bytes to void (py2) should give a buffer. So we're faced with the last two, and the first is just shorter |
|
hm yes as void can represent arbitrary bytes I like the |
|
I am actually trying out your idea of I like that because it gives back the original array. |
There was a problem hiding this comment.
you can just call PyBytes_FromStringAndSize on obval here (it is defined in our compat headers to work in python2 too)
There was a problem hiding this comment.
probably the same in this case, but shouldn't this be PyObject_Repr here?
There was a problem hiding this comment.
yes, already changed on my side
03408ec to
fa50ebd
Compare
|
I still need to figure out what is going on with 0-d arrays. Apparently they are printed totally differently here which I want to try to fix. >>> a = np.zeros(4, dtype='V4')
>>> a
array([b'\x00\x00\x00\x00', b'\x00\x00\x00\x00', b'\x00\x00\x00\x00',
b'\x00\x00\x00\x00'],
dtype='|V4')
>>> np.array(a[0])
array(array([0, 0, 0, 0], dtype=int8),
dtype='|V4') |
There was a problem hiding this comment.
This isn't enough - I think that this should be our output:
>>> np.array("Hello world").view(np.void)
array(b'\x68\x65\x6C\x6C\x6F\x20\x77\x6F\x72\x6C\x64', dtype=void)
Showing ascii characters as text isn't useful - we already have the bytes_ dtype for that
There was a problem hiding this comment.
Which you can produce with:
"b'{}'".format(''.join(map(r'\x{:02X}'.format, bytearray( voiditem.tobytes() ))))
I think that's a result of the |
Yes, that's right, since That code block (for |
|
Missed your link there |
There was a problem hiding this comment.
I think maybe just do this always - 2.7 supports the syntax, and that saves confusion when from __future__ import unicode_literals is in place. extrachars would go too, as a result
There was a problem hiding this comment.
This is lowercase - which makes separating the hex chars from the x a little harder on the eye
There was a problem hiding this comment.
lowercase seems to be the python standard though
There was a problem hiding this comment.
If nothing else, I think this function should have a comment pointing out that the string it produces is lowercase.
Uppercase is pretty typical of hex memory viewers, isn't it? If anything, being inconsistent with bytes.__repr__ makes it more obvious to the user that they're looking at a void
There was a problem hiding this comment.
Not sure how I feel about void.__str__ looking like byte.__repr__. Seems that here printing out raw hex might be better
There was a problem hiding this comment.
In the form void(hex='1b5b324b07410a08'), are you suggesting we modify the void constructor to add a hex argument? If not, I don't like the fact that the repr couldn''t actually be converted to an instance.
There was a problem hiding this comment.
Yes, I was indeed suggesting we allow it to take a kwarg. But that was an alternative to the bytes solution you settled for, and not my point here.
Here I'm suggesting something like:
>>> print(repr(v))
'\x12\x34\xAB`
>>> print(str(v))
12 34 AB
Which is sort of consistent with
>>> s = 'hello world'
>>> print(repr(v))
'hello world'
>>> print(str(v))
hello world
|
After looking at fixing the 0d array problem, it looks like a more general and more difficult problem better fixed in a future other PR. We seem to have double implementations of many of the dtype reprs: once in |
|
Wonderful. I agree, that can go in another PR |
25438a2 to
7e39112
Compare
|
Saw this as it had drifted up in the sort-by-update list. Nice! I briefly wondered about documentation beyond the release notes, but it seems there are no examples where void arrays are shown. |
466d620 to
3248ed9
Compare
|
Release notes updated |
doc/release/1.14.0-notes.rst
Outdated
There was a problem hiding this comment.
the void type was always customizable, but before it was done through the (still) mis-documented 'numpystr' key
numpy/core/tests/test_arrayprint.py
Outdated
There was a problem hiding this comment.
Can we add an eval(repr(x), vars(np)) == x test, like we have elsewhere, for both scalars and arrays? That's one of the features of this patch too
|
Good idea, done. |
|
Needs rebase. |
|
Sounds like this is ready. Needs a rebase. |
|
Rebased |
|
Merged, thanks Allan. |
|
|
||
|
|
||
| def _block_check_depths_match(arrays, parent_index=[]): | ||
| class _Recurser(object): |
There was a problem hiding this comment.
Uh oh - bad merge / rebase - this just reverted #9667
This restores the changes in numpygh-9667 that were overwritten.
This PR implements
voidtype_reprandvoidtype_strto output something more sensible.Currently, unstructured void types print their raw values directly to output, allowing you to do some funny things:
(if you paste this into an ANSI terminal the output willl be red colored and the terminal will beep)
In this PR I've implemented that the hex representation of the byte-string will be printed:
I also toyed with printing something like
<memory>,<V8>,<memory at 123456>for the elements. Better suggestions are welcome.