View Forum Topic

Forums
CS4660
Word counts using a list comprehension

Author	Message
rabbott Posts: 1649	Posted 12:20 Aug 26, 2018 \| Yesterday Luis Fisher said he thought it was possible to generate a word count using a list comprehension. I was skeptical. Luis was right, but as far as I can tell, the only way to do it is to rely on side effects -- which is not good programming style. `def do_side_effect(key, d): d[key] = d.get(key, 0) + 1 return (key, d[key]) words = ['a', 'ab', 'a', 'ad', 'sdfsf', 'a', 'sdfsf', 'ab'] counts = {}` `steps = [do_side_effect(word, counts) for word in words] print('steps:', steps) print('counts:', counts)` `=>` steps: [('a', 1), ('ab', 1), ('a', 2), ('ad', 1), ('sdfsf', 1), ('a', 3), ('sdfsf', 2), ('ab', 2)] counts: {'a': 3, 'ab': 2, 'ad': 1, 'sdfsf': 2} When run, the list comprehension generates a list of the `steps` the list comprehension takes. The actual count is stored in `counts`, which is updated as a side-effect of running the list comprehension, each time `do_side_effect` is called. As far as the list comprehension is concerned, the job of `do_side_effect` is to generate a value to put into the list being constructed. The value `do_side_effect` returns is a pair, whose first element is the current word, and whose second element is the then-current count of that word. As a side effect, `do_side_effect` updates the dictionary of counts for that word. This is not the intended use of a list comprehension, but it does job. Last edited by rabbott at 13:09 Aug 26, 2018.
jungsoolim Posts: 38	Posted 22:17 Aug 26, 2018 \| Can we put list comprehension in the body of function? def count_word_freq(words): return (dict(zip(words, (words.count(k) for k in words)))) words = ['a', 'ab', 'a', 'ad', 'sdfsf', 'a', 'sdfsf', 'ab'] print(count_word_freq(words)) If so, wouldn't this function be less efficient that the function using a loop? (i.e.) def word_frequencies(mylist): result = {} for k in mylist: result[k] = result.setdefault(k, 0) + 1 return result print(word_frequencies(words))
rabbott Posts: 1649	Posted 22:51 Aug 26, 2018 \| Definitely less efficient. It's O(n^2) since you have to scan words for each word. Even if you did something like what Jay did, it's still O(n^2). `def count_word_freq(words): return {w:words.count(w) for w in set(words)}` `words = ['a', 'ab', 'a', 'ad', 'sdfsf', 'a', 'sdfsf', 'ab'] print(count_word_freq(words))` Assuming dictionary access is O(1), the side-effect version and the for-loop version are both O(n).