bugshaw: (BugCount)
Add MemoryShare This Entry
posted by [personal profile] bugshaw at 02:59pm on 11/09/2009
I have a lot of CSV files (all with the same structure) to import into Excel and I need to set the data format to Text, Date etc as appropriate. There are 24 columns and it's a bit fiddly to do it manually each time in the Import Wizard. Any suggestions? I'm using Excel 2003. Cheers!
There are 14 comments on this entry. (Reply.)
ext_16733: (Default)
posted by [identity profile] akicif.livejournal.com at 02:12pm on 11/09/2009
It's not on this machine - in fact, I think the machine it was on is long since dead - but there used to be a version of perl for Windows that spoke Office/Excel.

Ah. ActiveState: http://community.activestate.com/forum-topic/excel-files-with-activest
 
posted by [identity profile] bugshaw.livejournal.com at 02:24pm on 11/09/2009
Cripes, I'd have a job learning how to implement that!

It's a work machine, they don't give me authority to install anything, so all I have is Excel. Also I don't know Perl :-)
ext_16733: (Default)
posted by [identity profile] akicif.livejournal.com at 02:50pm on 11/09/2009
And it's probably hideously inefficient for a one-off task into the bargain....
 
posted by [identity profile] ladymoonray.livejournal.com at 02:24pm on 11/09/2009
I do this all the time with downloads from our Vodafone database. Record a macro (tools/macro/record new macro) in a separate file while you do the first one, assign it a key combo (I use CTRL Q because that doesn't do anything else) and keep the macro file open. Open each .csv and press CTRL Q, then magic! The only thing you can't do with that is change the file type; or at least I've never found a way.
 
posted by [identity profile] bugshaw.livejournal.com at 02:39pm on 11/09/2009
Ooh! I did my first macro! :-) It was fun, but I felt very selfconscious while clicking...

It doesn't do quite what I need. If I open the CSV in Excel then change the data types, I lose some information (mostly leading zeroes) that I can't automatically recover. Instead I create a new workbook and use Data | Import External Data which prompts for a source file then a Wizard lets you sort out the delimiters and data types. I can't seem to start a macro recording _during_ the import, so the formatting recording is linked to a particular file.

I suppose I could save the files I want to format as RunMacro.csv, process, save as what I want, delete RunMacro.csv, and save the next file as RunMacro. It would be quicker than the current process!

Thanks.
 
posted by [identity profile] ladymoonray.livejournal.com at 03:15pm on 11/09/2009
Ah yes, I see what you mean. I have no suggestions; I'm not actually very good at this at all, and I'm sure my macro has far too many steps.

Hope you can find a way to speed things up a little, anyway.

 
posted by [identity profile] errolwi.livejournal.com at 06:53pm on 11/09/2009
This might be a stretch for someone who has just done their first macro. However, if you can find the code that the macro recording generates,then you can manually change the name of the data file being opened. I think it's labelled macro|edit.
 
posted by [identity profile] johannes-d.livejournal.com at 02:33pm on 11/09/2009
Provided the data is correctly formatted for excel I find just opening csv files generally works satisfactorily.

test,2.5,"£67.8",25/2/52,"25/2/52"

the string above in a csv file opens in excel with the test left aligned as text, the number right aligned as a generalnumber, the £67.80 as currency and the two dates as dates.
 
posted by [identity profile] pseudomonas.livejournal.com at 02:42pm on 11/09/2009
This doesn't always work. Notoriously if you have a protein name such as the septation protein Sep7, Excel helpfully recognizes it as a date and autoformats it to 7/9/1901 or whatever the default date format is. Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics by Zeeberg & al. describes the extremely annoying phenomenon in depth.
 
posted by [identity profile] bugshaw.livejournal.com at 02:47pm on 11/09/2009
Have list of nurses, not proteins (though nurses are made of proteins), but I will bear this in mind for Septicaemia courses if abbreviated! Wonder how nurses called Tuesday Next would be handled? If you reply to this, the comment notification in Google will offer to put it into my calendar...
 
posted by [identity profile] pseudomonas.livejournal.com at 02:54pm on 11/09/2009
A more common problem is probably when someone's put two alternatives in a numerical field, so "5/6" for "five or six" is interpreted as "fifth of June" or "May the sixth" depending on locale.
 
posted by [identity profile] bugshaw.livejournal.com at 02:44pm on 11/09/2009
Ah, they all send me things with leading zeroes which I should process as text but they're not in double quotes.
Opening 0213456, 07
gives me 213456 | 7
but "0213456","07"
should give me 0213456 | 7 ?

Darn, I'd assumed they'd been doing something sensible. Switching to quote-delimited CSVs will give problems elsewhere in the process.
 
posted by [identity profile] johannes-d.livejournal.com at 03:56pm on 11/09/2009
I'd probably import one file to a table in access. It has much better control of datatypes and formats. set all the data types as I wanted them. Then selecting the whole table and pasting it into excel will preserve the text fields with leading zeroes.

If you delete the data from this table it is a fairly quick process to import subsequent files and export them to excel
timill: (Default)
posted by [personal profile] timill at 03:37pm on 11/09/2009
I don't have Excel 2003 readily available to me, but...

If you create a dummy file where the first row has entries in the required formats, will importing the file into rows 2+ then adopt the existing formats? (expects answer 'no').

September

SunMonTueWedThuFriSat
  1
 
2
 
3
 
4
 
5
 
6
 
7
 
8
 
9
 
10
 
11
 
12
 
13
 
14
 
15
 
16
 
17
 
18
 
19
 
20
 
21 22
 
23
 
24
 
25
 
26
 
27
 
28
 
29
 
30