Project 1: File Format Conversions


This project involves converting between three plain-text data representations: CSV, JSON, and XML. Using the starter code provided, you need to write Python 3 code that can read data in a given format, store it in some common internal representation, and output it in the desired format. You may only use Python 3's built-in modules, and we strongly encourage you to make use of the "csv", "json", and "xml.etree" modules.

What needs to be done

You need to submit a file named "". There are 6 functions that need to be implemented.

To run your code, I've included some code in the starter file that runs one of the test cases. You are welcome to modify or delete this code as you see fit. Each test case provides a formatted string to one of the read_*_string functions, saves the returned object to a variable, passes that object to one of the write_*_string functions, and checks that the returned string is formatted correctly. All possible combinations of read and write functions are tested.
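The shape of such a check can be pictured roughly as follows. This is only a sketch of the idea, not the actual test code: run_case, stand_in_read, and stand_in_write are hypothetical names, and the "json" module is used here purely as a stand-in for one reader/writer pair.

```python
import json

def run_case(read_func, write_func, input_text, expected_text):
    """Read a string, write it back out, and compare with the expected string."""
    intermediate = read_func(input_text)    # whatever object the reader returns
    output_text = write_func(intermediate)  # the writer turns it back into text
    return output_text == expected_text

# Stand-ins for one reader/writer pair (an assumption, not the starter code).
def stand_in_read(text):
    return json.loads(text)

def stand_in_write(records):
    return json.dumps(records, sort_keys=True)

passed = run_case(stand_in_read, stand_in_write,
                  '[{"name": "Josh Nahum"}]',
                  '[{"name": "Josh Nahum"}]')
print(passed)
```

Because the intermediate object is only ever handed from a reader to a writer, the check never needs to inspect it directly.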

Intermediate Object

What should the object returned by the read_*_string functions (and passed to the write_*_string functions) be? It is up to you. Think about what information needs to be stored to determine the correct output. You can use lists, dictionaries, tuples, custom objects, or anything else. My test cases don't care about the form of the object.
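As one illustration (a possible choice, not a requirement), the intermediate object could be a list with one dictionary per record. The "id" column and the column_names helper below are made up for the example; values are kept as strings on the assumption that CSV and XML only carry text.

```python
# One possible intermediate object (an assumption, not the required design):
# a list with one dictionary per record, mapping column name -> string value.
records = [
    {"name": "Josh Nahum", "id": "1"},
    {"name": "Tyler Derr", "id": "2"},
]

def column_names(records):
    """Collect every column name that appears in any record."""
    names = set()
    for record in records:
        names.update(record.keys())
    return names

print(column_names(records))
```

A structure like this keeps the column names next to each value, which is the information every output format needs.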


Order of Columns/Attributes

Although the order of the columns/attributes doesn't matter in practice, for ease of testing, output the columns/attributes in lexicographical order.
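A minimal sketch of this ordering, assuming the column names are plain Python strings (the names "zip", "city", and "beds" and the helper ordered_columns are illustrative only). Python's sorted() compares strings by code point, so uppercase letters come before lowercase ones; whether the test data mixes cases is not stated here.

```python
def ordered_columns(names):
    """Return the column names in lexicographical (code point) order."""
    return sorted(names)

print(ordered_columns(["zip", "city", "beds"]))  # ['beds', 'city', 'zip']
print(ordered_columns(["zip", "City", "beds"]))  # ['City', 'beds', 'zip']
```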

CSV Specific Instruction

You need to have a header line denoting the columns.
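The "csv" module can handle the header line in both directions. The sketch below assumes a list-of-dictionaries intermediate object and uses hypothetical helper names (csv_text_to_records, records_to_csv_text). The "\n" line terminator is also an assumption: the module's default is "\r\n", and which one the tests expect is not stated here.

```python
import csv
import io

def csv_text_to_records(text):
    """Parse CSV text; the first line is treated as the header."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]

def records_to_csv_text(records):
    """Write a header line, then one line per record, columns sorted."""
    columns = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    # lineterminator="\n" is an assumption; csv defaults to "\r\n".
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

records = csv_text_to_records("name,id\nJosh Nahum,1\nTyler Derr,2\n")
print(records_to_csv_text(records))
```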

JSON Specific Instructions

The tabular JSON format that you will be using for input and output consists of a single array. Each record (e.g. a line in a CSV file) becomes one object in the JSON array. A record object has a property (i.e. key) for each column, whose value is that record's value for the column.
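With the "json" module, an array of record objects maps naturally onto a Python list of dictionaries. The sketch below uses hypothetical helper names and an illustrative "id" column. It relies on sort_keys=True for the column order and on the module's default separators; the exact spacing the tests expect is not specified here, so treat that as an assumption.

```python
import json

def json_text_to_records(text):
    """Parse a JSON array of objects into a list of dictionaries."""
    return json.loads(text)

def records_to_json_text(records):
    """Serialize the records as one JSON array, keys in sorted order."""
    return json.dumps(records, sort_keys=True)

text = '[{"name": "Josh Nahum", "id": "1"}, {"name": "Tyler Derr", "id": "2"}]'
print(records_to_json_text(json_text_to_records(text)))
# [{"id": "1", "name": "Josh Nahum"}, {"id": "2", "name": "Tyler Derr"}]
```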

XML Specific Instructions

XML gives you a lot of freedom in how you structure your data. For this project, each XML file should contain a single "data" node, with as many "record" nodes within it as needed. In each record, there will be a column node corresponding to each column. In each column node, the text content should be the value for that record. No element should have any attributes.


  <data>
      <record>
          <name>Josh Nahum</name>
      </record>
      <record>
          <name>Tyler Derr</name>
      </record>
  </data>

Note: This output has been prettified; the correct output is all on one line (no extraneous whitespace).
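The data/record/column layout described in this section can be read and written with "xml.etree.ElementTree". The sketch below assumes a list-of-dictionaries intermediate object with string values; the helper names are hypothetical. ET.tostring with encoding="unicode" returns a plain string with no added whitespace and no XML declaration.

```python
import xml.etree.ElementTree as ET

def xml_text_to_records(text):
    """Turn each "record" node into a dict of column tag -> text content."""
    root = ET.fromstring(text)
    return [{column.tag: (column.text or "") for column in record}
            for record in root.findall("record")]

def records_to_xml_text(records):
    """Build data > record > column nodes, columns in sorted order."""
    root = ET.Element("data")
    for record in records:
        record_node = ET.SubElement(root, "record")
        for column in sorted(record):
            ET.SubElement(record_node, column).text = record[column]
    return ET.tostring(root, encoding="unicode")

xml_text = ("<data><record><name>Josh Nahum</name></record>"
            "<record><name>Tyler Derr</name></record></data>")
print(records_to_xml_text(xml_text_to_records(xml_text)))
```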

What are the example datasets?

There are two example data files provided for you (from with some minor modifications):

  • "*crime*" contains 7,584 crime records, as made available by the Sacramento Police Department.
  • "*realestate.*" is a list of 985 real estate transactions in the Sacramento area over a five-day period, as reported by the Sacramento Bee.