Python csv: count items in one column based on the value of another column
I'm new to programming in Python. I have a large CSV file (~5k items). There are 2 columns whose data I need counted. The best way to explain what I need is to show a few rows of the CSV:
    name                  optionaldata5
    collaborative desk    broward
    collaborative desk    broward
    academic desk         broward
    academic desk         broward
    academic desk         broward
    academic desk         broward
    collaborative desk    broward
    collaborative desk    broward
    collaborative desk    broward
    collaborative desk    broward
    broward               broward
    alachua               alachua
    collaborative desk    alachua
    collaborative desk    alachua
    collaborative desk    alachua
    collaborative desk    alachua
    collaborative desk    alachua
In the above example, I want the result to be as follows:
    broward:
        collaborative desk - 6
        academic desk - 4
        broward - 1
    alachua:
        collaborative desk - 5
        alachua - 1
Maybe with a total for each, and then on to the next library in the spreadsheet.
I started writing code, but I'm wondering if there is a better way to do this.
Assuming your data is tab-delimited, here is one way of getting what you want:
    import csv
    from collections import defaultdict, Counter

    # Group the names (column 0) by library (column 1)
    data = defaultdict(list)
    with open('data') as input_file:
        csv_reader = csv.reader(input_file, delimiter='\t')
        for row in csv_reader:
            data[row[1]].append(row[0])
After this, data will contain:
    {'alachua': ['alachua', 'collaborative desk', 'collaborative desk',
                 'collaborative desk', 'collaborative desk', 'collaborative desk'],
     'broward': ['collaborative desk', 'collaborative desk', 'academic desk',
                 'academic desk', 'academic desk', 'academic desk',
                 'collaborative desk', 'collaborative desk', 'collaborative desk',
                 'collaborative desk', 'broward']}
You can then iterate over the list of values for each key and total the counts yourself, or use the Counter class from Python's collections module:
    for k, v in data.items():
        print(k)
        print(Counter(v))
This prints:
    alachua
    Counter({'collaborative desk': 5, 'alachua': 1})
    broward
    Counter({'collaborative desk': 6, 'academic desk': 4, 'broward': 1})
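
To get closer to the layout asked for in the question, including a total per library, something like the following might work. This is only a sketch under a few assumptions: Python 3, tab-delimited rows with the name in the first column and the library in the second, and no header row. The helper names count_by_library and format_report and the inline sample rows are made up for illustration; in practice you would pass an open file object instead of the StringIO.

```python
import csv
import io
from collections import Counter, defaultdict


def count_by_library(lines, delimiter="\t"):
    """Return {library: Counter of names} from rows of (name, library)."""
    counts = defaultdict(Counter)
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        name, library = row[0], row[1]
        counts[library][name] += 1
    return counts


def format_report(counts):
    """Build the per-library report as text, most frequent names first."""
    out = []
    for library, names in counts.items():
        out.append(f"{library}:")
        for name, n in names.most_common():
            out.append(f"    {name} - {n}")
        out.append(f"    total - {sum(names.values())}")
    return "\n".join(out)


# Hypothetical sample rows standing in for the real file
sample = io.StringIO(
    "collaborative desk\tbroward\n"
    "academic desk\tbroward\n"
    "collaborative desk\tbroward\n"
    "broward\tbroward\n"
    "alachua\talachua\n"
    "collaborative desk\talachua\n"
)
report = format_report(count_by_library(sample))
print(report)
```

Counting directly into a Counter per library avoids building the intermediate lists. If the real file starts with a header row, it would be counted like any other row, so you may want to skip it first (for example by calling next() on the file object before passing it in).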