[colug-432] help with awk

Rick Hornsby richardjhornsby at gmail.com
Tue Oct 6 20:29:25 EDT 2015


> On Oct 6, 2015, at 18:57, Keith Larson <klarson at k12group.net> wrote:
> 
> I think that awk is the right tool, but I'm open to a better solution if there is one.
> 
> I have a .csv format file that I want to split into two files based on the value found in one of the columns.  For discussion purposes, say that the file has several different values about a person.  FirstName, LastName, Phone, DOB and Gender.  The file does have a header row.  I would like two files, one with all entries where gender is M and another file where gender is F.  I would like the header row in both files if possible.  The key is that I want the row to be kept completely intact with the delimiters remaining in place.
> 
> Any suggested solutions?  Ideally, I want something that can be done in a bash script so that I can run this on a nightly schedule.

awk is generally the right tool for field-delimited data, yes.  But going by what you suggest, the file looks like

John,Smith,6145551212,01011980,M
Lucy,Jones,6145551213,01011980,F

there’s a simple solution if you’re guaranteed that the single letter at the end of each line is the gender:

grep ',F$' source.csv > females.csv
grep ',M$' source.csv > males.csv
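Note that the grep lines on their own won't carry the header row over, since the header doesn't end in ,F or ,M. A minimal sketch of one way to keep it, assuming the header is always the first line of the file (the function name split_with_header is made up for illustration):

```shell
# Hypothetical helper: split_with_header SOURCE PATTERN DEST
# Assumes the header row is line 1 of SOURCE.
split_with_header() {
    # Start DEST with a copy of the header row
    head -n 1 "$1" > "$3"
    # tail -n +2 skips the header, so only data rows are tested against PATTERN
    tail -n +2 "$1" | grep -e "$2" >> "$3"
}

# Example calls, using the file names from above:
#   split_with_header source.csv ',F$' females.csv
#   split_with_header source.csv ',M$' males.csv
```

Each matching row is appended exactly as it appears in the source, so the delimiters stay in place.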

If you need to do more sophisticated processing based on the fields, you can do something like this on each line (watch the IFS / whitespace issues with $line):

gender=$(echo "$line" | awk -F, '{print $5}')   # Gender is the 5th field
case $gender in
   M)
       # do something here
       echo "$line" >> "$males"
       ;;
   F)
       # do something else here
       echo "$line" >> "$females"
       ;;
esac
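If all you need is the split itself, awk can also do it in a single pass without a shell loop. This is only a sketch under a few assumptions: the header is line 1, Gender is the fifth comma-separated field, no field contains a quoted comma, and the file has Unix line endings. The function name split_by_gender and its argument order are invented for the example.

```shell
# Hypothetical helper: split_by_gender SOURCE MALES_FILE FEMALES_FILE
split_by_gender() {
    awk -F, -v males="$2" -v females="$3" '
        # Line 1 is assumed to be the header: copy it to both output files
        NR == 1   { print > males; print > females; next }
        # print writes the whole unmodified line, so delimiters stay intact
        $5 == "M" { print > males }
        $5 == "F" { print > females }
    ' "$1"
}

# Example call:
#   split_by_gender source.csv males.csv females.csv
```

Rows whose fifth field is neither M nor F are dropped from both files, so add another rule if you need to catch those.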



… or what Jeff suggested.