Matthew L. Wright
Assistant Professor, St. Olaf College

Homework 15: Python Dictionaries

CS 121 ⋅ Spring 2016

The following exercises involve Python files, strings, and dictionaries. Upload the Python files containing your solutions to Moodle for HW15.

  1. Anagrams are words or phrases that have the same letters, but in a different order. For example, the following pairs are anagrams: "arm" and "ram", "agrees" and "grease", and "debit card" and "bad credit".

    Write a program that asks the user to enter two words or phrases, and says whether or not they are anagrams.

    Hint: use the function countLetters() from class.

    For example, the Python console might look something like this when you run your program:

    Enter a word or phrase: debit card
    Enter another word or phrase: bad credit
    Anagrams.

    Enter a word or phrase: test
    Enter another word or phrase: dog
    Not anagrams.
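
    One possible approach, sketched under assumptions: the in-class countLetters() is not reproduced in this handout, so the version below is a hypothetical stand-in that returns a dictionary mapping each letter to its count, ignoring case, spaces, and punctuation. The helper name areAnagrams() and the message "Anagrams." are also assumptions made for illustration.

```python
def countLetters(text):
    """Hypothetical stand-in for the in-class function.

    Returns a dictionary mapping each letter in text to how often it occurs.
    Case, spaces, digits, and punctuation are ignored (an assumption).
    """
    counts = {}
    for ch in text.lower():
        if ch.isalpha():
            counts[ch] = counts.get(ch, 0) + 1
    return counts


def areAnagrams(first, second):
    """Two phrases are anagrams when their letter counts are identical."""
    return countLetters(first) == countLetters(second)


def main():
    """Interactive version; call main() to run it in the console."""
    first = input("Enter a word or phrase: ")
    second = input("Enter another word or phrase: ")
    if areAnagrams(first, second):
        print("Anagrams.")
    else:
        print("Not anagrams.")
```

    Comparing two dictionaries with == checks that they hold the same keys with the same values, so no explicit loop over the letters is needed. Note that this sketch would also report two identical phrases as anagrams; whether that should count is left open by the problem statement.
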
  2. Write a function removeNonAlpha(text) that removes all non-alphabetic characters from a string. For example:

    removeNonAlpha("abc123.") returns "abc"

    removeNonAlpha("Python!") returns "Python"

    removeNonAlpha("red, white, and blue") returns "redwhiteandblue"

    Note: You don't need to use dictionaries to write this function. You will use this function in the next problem, which does involve dictionaries.
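
    A minimal sketch of one way to write this function, assuming that "alphabetic" means whatever Python's built-in str.isalpha() accepts (which includes accented and other non-English letters, not only a through z):

```python
def removeNonAlpha(text):
    """Return a copy of text with every non-alphabetic character removed."""
    result = ""
    for ch in text:
        # str.isalpha() is True only for letters, so digits, spaces,
        # and punctuation are skipped.
        if ch.isalpha():
            result += ch
    return result
```

    The function builds a new string rather than changing the original, since Python strings are immutable.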

  3. Write a program that reads a text file and determines all unique words in the file, and the number of times that each word appears.

    To do this, first review the countLetters() function from class. This time, however, you will need to first read a file, split each line in the file by whitespace, and then use your removeNonAlpha() function from above to remove numbers and punctuation from each word. The sample output below is entirely lowercase, so also convert each word to lowercase before counting. Then use a dictionary to store the unique words and to count the number of times that they occur.

    After counting all the words in the file, your program should save the words (in alphabetical order) and their counts to a file called word_counts.txt.

    You can test your program on the following files: shakespeare.txt and milton.txt.

    If you run your program on milton.txt, the first ten lines of your output file should be:

    a 57
    abarim 1
    abasht 1
    abbana 1
    abide 1
    abject 2
    abominations 1
    about 1
    above 7
    abusd 1
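
    The steps for this problem can be sketched as follows. This is an illustration under stated assumptions, not a reference solution: the function names countWords() and saveCounts() are hypothetical, words are lowercased because the sample output above is all lowercase, and tokens that become empty after cleaning (such as "42" or "--") are skipped.

```python
def removeNonAlpha(text):
    """Return text with every non-alphabetic character removed."""
    return "".join(ch for ch in text if ch.isalpha())


def countWords(inputName):
    """Read the file and return a dictionary mapping each word to its count."""
    counts = {}
    with open(inputName) as infile:
        for line in infile:
            # split() with no argument splits on any run of whitespace.
            for token in line.split():
                word = removeNonAlpha(token).lower()
                if word:  # skip tokens with no letters at all
                    counts[word] = counts.get(word, 0) + 1
    return counts


def saveCounts(counts, outputName="word_counts.txt"):
    """Write one 'word count' line per word, in alphabetical order."""
    with open(outputName, "w") as outfile:
        for word in sorted(counts):
            outfile.write(word + " " + str(counts[word]) + "\n")
```

    With these definitions, saveCounts(countWords("milton.txt")) would read milton.txt and write the results to word_counts.txt. Sorting the dictionary's keys with sorted() gives the alphabetical order, since a dictionary is not meant to be relied on for sorted order.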