4
\$\begingroup\$

I would like to show the tree-like structure contained in a multi-dimensional tuple by using the set of graphical glyphs that exist in Unicode fonts (e.g. \u2500, \u2502, \u2514 and others). Here is an example of the kind of output, I'm talking about:

>>> showtree('(01, (02,  03, 04), 05, ((06, 07), 08), (09, (10, 11), (12, 13), 14), 15)')
┬─01
├─┬─02
│ ├─03
│ └─04
├─05
├─┬─┬─06
│ │ └─07
│ └─08
├─┬─09
│ ├─┬─10
│ │ └─11
│ ├─┬─12
│ │ └─13
│ └─14
└─15

Ideally, empty items, multi-words items, extra spaces around commas or parenthesis, as well as unbalanced parenthesis should be managed correctly, as in the following example:

>>> showtree('( A, B multi-word item , (C,D), ( E  , ,  F ), G  )'))
┬─A
├─B multi-word item
├─┬─C
│ └─D
├─┬─E
│ ├─
│ └─F
└─G

I came up with a solution that works quite well (the examples above have been generated with it) but the implemented algorithm is not very elegant, and the code not very pythonic :

def showtree(string):
    """display multidimensional tuple as a tree"""
    glyphs = ['\u252C\u2500','\u251C\u2500','\u2514\u2500','\u2502 ']
    tree, glyph, item = [], [], []
    for char in string:
        if char in ',)' and item:                                   # add glyph prefix and current item to tree
            tree.append(glyph + [''.join(item).strip()]); item = []
        if char == ',':                                             # update glyph prefix for new item in sublist
            glyph = [glyphs[3]] * (len(glyph)-1) + [glyphs[1]]
        elif char == ')':                                           # update glyph prefix for last item in sublist
            tree[-1][-2] = glyphs[2]; glyph = glyph[:-1] 
        elif char == '(':                                           # update glyph prefix for first item in sublist
            glyph.append(glyphs[0])
        else:                                                       # other chars are simply added to current item
            item.append(char)
    return '\n'.join(''.join(node) for node in tree)

So I would like to get some ideas for an improved implementation, maybe using regular expressions or other advanced parsing techniques. Thank you very much for any hint...

\$\endgroup\$
3
  • \$\begingroup\$ when i hear "tree" i think recursion... (don't know yet if it's useful in this case...) btw. you have a function named 'tree' and a variable also named 'tree', that is confusing. \$\endgroup\$
    – Jan Kuiken
    Commented Feb 27, 2020 at 16:56
  • \$\begingroup\$ @JanKuiken: Recursion was my first attempt, but I couldn't manage it properly, so I came back to a solution where the glyph stack is managed by hand, with a standard Python list. \$\endgroup\$ Commented Feb 27, 2020 at 17:01
  • \$\begingroup\$ @JanKuiken: I renamed the actual name of the function when I pasted the code here, and didn't notice the name collision. I've edited the OP to remove collision. Thanks. \$\endgroup\$ Commented Feb 27, 2020 at 17:07

2 Answers 2

2
\$\begingroup\$

Neither the code provided in the question or the top voted answer work correctly. This is as you're making unreasonable assumptions on how the data is formed.

>>> print(showtree('(A)'))
└─A
>>> print(showtree('(1, ((2)), (3, 4, (5)))'))
┬─1
├─┬─└─2
├─┬─3
│ ├─4
│ ├─└─5
>>> print(tree_string_to_display_string('(1, ((2)), (3, 4, (5)))'))
┬─1
├─┬─└─2
├─┬─3
│ ├─4
│ ├─└─5

Your code is also doing two things at once, parsing a string and building a tree. And so I will only take a tuple as input. If you need to take a string then you can build the parser separately.

  1. Make the code work with one item.

    To do this should be fairly simple, we take the tuple ('1',) and return ──1.

    def build_tree(root):
        return '\n'.join(_build_tree(root))
    def _build_tree(node):
        yield '──' + node[0]
    
    >>> print(build_tree(('1',)))
    ──1
    
  2. Make the code work with 1, 2 or more items.

    This is fairly simple, we check node's length and use the above code if it's 1. Otherwise we make:

    • The first item start with ┬─.
    • The last item start with └─.
    • All other items start with ├─.

    This is as simple as using tuple unpacking, and then iterating over the middle.

    def _build_tree(node):
        if len(node) == 1:
            yield '──' + node[0]
            return
        start, *mid, end = node
        yield '┬─' + start
        for value in mid:
            yield '├─' + value
        yield '└─' + end
    
    >>> print(build_tree(('1',)))
    ──1
    >>> print(build_tree(('1', '2')))
    ┬─1
    └─2
    >>> print(build_tree(('1', '2', '3')))
    ┬─1
    ├─2
    └─3
    
  3. Get the code to work if you only enter a value, no tuples.

    This is simple, since we're only allowing tuples as the nesting datatype, then we can just add an if not isinstance(node, tuple). We will convert this to a string now to help our future selves.

    def _build_tree(node):
        if not isinstance(node, tuple):
            yield str(node)
            return
        if len(node) == 1:
            yield '──' + node[0]
            return
        start, *mid, end = node
        yield '┬─' + start
        for value in mid:
            yield '├─' + value
        yield '└─' + end
    
    >>> print(build_tree('1'))
    1
    
  4. Get the code to work recursively with the same input as (2). To check this we will change the input to integers.

    This is simple. We run _build_tree on all the values in node, if it's a tuple. From here we only work on these values. We know these values are going to be iterators with only one value. This means we can just use next for now.

    def _build_tree(node):
        if not isinstance(node, tuple):
            yield str(node)
            return
        values = [_build_tree(n) for n in node]
        if len(values) == 1:
            yield '──' + next(values[0])
            return
        start, *mid, end = values
        yield '┬─' + next(start)
        for value in mid:
            yield '├─' + next(value)
        yield '└─' + next(end)
    
    >>> print(build_tree((1,)))
    ──1
    >>> print(build_tree((1, 2)))
    ┬─1
    └─2
    >>> print(build_tree((1, 2, 3)))
    ┬─1
    ├─2
    └─3
    
  5. Get the code working recursively.

    We know all the current yields work the same way, and so this calls for a new function. This should take three values:

    1. The value to add to the first item. (This is what we're doing right now)
    2. The value to add on all other items.
      This is important as if a node is only one large. But has nested data that is larger than one value then we will add ' ' to the output.
    3. The nested data.

    This is really simple to build:

    def build_lines(first, other, values):
        yield first + next(values)
        for value in values:
            yield other + value
    

    Finally we adjust the current yields so they are yield from functions.

def build_tree(root):
    return '\n'.join(_build_tree(root))
def _build_tree(node):
    if not isinstance(node, tuple):
        yield str(node)
        return
    values = [_build_tree(n) for n in node]
    if len(values) == 1:
        yield from build_lines('──', '  ', values[0])
        return
    start, *mid, end = values
    yield from build_lines('┬─', '│ ', start)
    for value in mid:
        yield from build_lines('├─', '│ ', value)
    yield from build_lines('└─', '  ', end)
def build_lines(first, other, values):
    yield first + next(values)
    for value in values:
        yield other + value
>>> print(build_tree(('01', ('02',  '03', '04'), '05', (('06', '07'), '08'), ('09', ('10', '11'), ('12', '13'), '14'), '15')))
┬─01
├─┬─02
│ ├─03
│ └─04
├─05
├─┬─┬─06
│ │ └─07
│ └─08
├─┬─09
│ ├─┬─10
│ │ └─11
│ ├─┬─12
│ │ └─13
│ └─14
└─15
>>> print(build_tree(('A', 'B multi-word item', ('C', 'D'), ('E', '', 'F'), 'G')))
┬─A
├─B multi-word item
├─┬─C
│ └─D
├─┬─E
│ ├─
│ └─F
└─G
>>> print(build_tree((1, ((2,),), (3, 4, (5,)))))
┬─1
├─────2
└─┬─3
  ├─4
  └───5
\$\endgroup\$
1
  • 1
    \$\begingroup\$ Woow ! This is the first time that I submit some unsatisfactory code of mine to CodeReview, and the result is brilliant.Tuple unpacking with the *-op in the middle is exactly what I was seeking for. And using a recursive generator seems so obvious, once you see it on screen. By the way, I need this tree-like display in a context where I don't want the user to enclose all items in quotes to create well-formed strings. But as you suggested, it's much easier to convert the loose syntax string into a well-formed tuple, then call your recursive fonction. Thanks a lot for this (master)piece of code. \$\endgroup\$ Commented Feb 28, 2020 at 20:49
1
\$\begingroup\$

I'm still thinking about recursion, but i think you nailed it with your solution!

However to better understand your code I have made some cosmetically changes:

  • removed the glyphs list and used string literals when glyphs was used
  • changed variable name tree to display_rows
  • changed variable name glyph to prefix
  • changed the type of item to string
  • added more 'air' to the code by using empty lines and replacing comments to keep them within the 80 characters limit

This led to:

def tree_string_to_display_string(tree_string):
    """change 'tuple-tree-string' to a string for display"""
    display_rows = []
    prefix = []
    item = ''
    for char in tree_string:
        if char in ',)' and item:
            # add glyph prefix and current item to tree
            display_rows.append(prefix + [item.strip()])
            item = ''
        if char == ',':
            # update glyph prefix for new item in sublist
            prefix = ['│ '] * (len(prefix)-1) + ['├─']
        elif char == ')':
            # update glyph prefix for last item in sublist
            display_rows[-1][-2] = '└─'
            prefix = prefix[:-1] 
        elif char == '(':
            # update glyph prefix for first item in sublist
            prefix.append('┬─')
        else:
            # other chars are simply added to current item
            item += char
    return '\n'.join(''.join(node) for node in display_rows)
s = '( A, B multi-word item , (C,D), ( E  , ,  F ), G  )'
print(tree_string_to_display_string(s))
s = '(01, (02,  03, 04), 05, ((06, 07), 08), (09, (10, 11), (12, 13), 14), 15)'
print(tree_string_to_display_string(s))
\$\endgroup\$
1
  • \$\begingroup\$ Oops: found a bug (/issue), string '( A, (B,C) , D)' gives desired result, however string '( A, (B,C))' does not... \$\endgroup\$
    – Jan Kuiken
    Commented Feb 27, 2020 at 19:38

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.