By Brendan Fitzpatrick
The question of “which scripting language should I use for my project” is aimed at beginner to intermediate Unix SAs and scripters. I will not attempt to dive into details; this is a high-level, generic discussion to help you determine which scripting language to choose for your project.
Unix command-line tools and shells such as sed, awk, csh and ksh are, in my opinion, the most powerful tools to have in your skill set, not because of the features they offer (their features are quite limited) but because of how quickly you can process files on the fly, compared with having to write a full program in most other scripting languages. These tools work best on small to medium-sized files with limited processing. I always opt for them when dealing with files of several hundred lines. By processing I mean reformatting files, merging them, comparing two fields, etc. In my experience, doing much more than this on files larger than 500 rows starts to take minutes of processing time. It’s okay for a script to run for several minutes or even hours if it doesn’t run frequently (that is, not multiple times per day). Error handling in these tools is not advanced, so you need to take that into consideration as well.
Perl is my favorite language (along with PHP for web scripting). It is incredibly powerful and very fast on files of a few thousand lines with fairly complex processing, and it usually does the trick for 90% of my tasks. Similarly, you can use Python, which I find even more efficient and powerful. These languages are fairly easy to pick up as long as you have a sense of programming; however, once you start working with files of tens or hundreds of thousands of lines, you may want to consider a database or C code. For these cases I highly recommend a database (mysql, sybase, oracle, etc.), although you may need to preprocess the data just to import it. If you have hours or days to spare, you can likely use Perl as your import tool. C code is often the best option, but you need advanced coding skills to use C on very large files.
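To make the preprocess-then-import step a little more concrete, here is a rough sketch in Python. It is only an illustration under several assumptions: the input is a pipe-delimited file with three columns, the table name counts and the helper names preprocess, load and mismatches are made up for this example, and SQLite (from Python's standard library) stands in for a real server such as mysql, sybase or oracle.

```python
import csv
import io
import sqlite3


def preprocess(lines, delimiter="|"):
    """Yield cleaned (name, expected, actual) rows; skip anything malformed."""
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) != 3:
            continue  # wrong number of fields, e.g. a stray text line
        name, expected, actual = (field.strip() for field in row)
        try:
            yield name, int(expected), int(actual)
        except ValueError:
            continue  # non-numeric field, e.g. the header row


def load(rows, conn):
    """Bulk-insert the cleaned rows into a hypothetical 'counts' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS counts "
        "(name TEXT, expected INTEGER, actual INTEGER)"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO counts VALUES (?, ?, ?)", rows)


def mismatches(conn):
    """Let the database do the 'compare two fields' work."""
    cur = conn.execute(
        "SELECT name FROM counts WHERE expected <> actual ORDER BY name"
    )
    return [row[0] for row in cur]


# Made-up sample data standing in for a real input file.
SAMPLE = """name|expected|actual
web01 | 10 | 10
db01|7|9
broken line
app02|3|3
"""

conn = sqlite3.connect(":memory:")  # assumption: in-memory stand-in database
load(preprocess(io.StringIO(SAMPLE)), conn)
print(mismatches(conn))  # prints ['db01']
```

With a real file you would pass an open file handle instead of the io.StringIO sample. The point of the split is that the script only cleans and loads the data, while the heavier comparison work is left to the database.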