From the outset of this project I wanted to ensure that the program would run on a base install of Perl. This, however, meant not using any of the pre-built modules that Perl is famous for. Taking this route goes slightly against "The Perl Way", but I decided that the program should work "out of the box" and shouldn't need additional modules installed from CPAN.
As a result I ended up learning a lot more about Perl than I would have otherwise. This includes not only the language itself, but also how Perl deals with variables and data structures internally.
Once I began designing the application it quickly became apparent that I would need very tightly defined subprocedures for each command the program would execute. For instance, if a user entered an UPDATE statement I needed to be sure that its syntax would be fully validated. I then needed to ensure that the file wasn't already open in the program (otherwise the program would be closing the file and opening it again). I also needed to ensure the integrity of the data being processed, and so had to use locks to make sure the program would only open a file that was not already open in another process. Once this stage was reached I needed to make sure that the expression in the WHERE portion of the statement contained a valid column name. Finally, I needed to make sure that while the UPDATE was changing a value in the data structure it had no impact on any other values and could not perform any operation which might jeopardise the stability of the data structure.
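The checks described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual CsvSQL code: the names parse_update, validate_columns and open_locked are hypothetical, the grammar is reduced to a single SET assignment and a single equality test in the WHERE clause, and Fcntl is assumed to be acceptable because it ships with Perl itself rather than coming from CPAN. Note also that flock locks are advisory, so they only protect against other processes that request the same lock.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);    # ships with Perl; provides LOCK_EX and LOCK_NB

# Hypothetical helper: split "UPDATE file SET col = value WHERE col = value"
# into its parts. Returns a hash reference, or nothing if the syntax is wrong.
sub parse_update {
    my ($statement) = @_;
    return unless $statement =~
        /^\s*UPDATE\s+(\S+)\s+SET\s+(\w+)\s*=\s*(.+?)\s+WHERE\s+(\w+)\s*=\s*(.+?)\s*;?\s*$/i;
    return {
        file         => $1,
        set_column   => $2,
        set_value    => $3,
        where_column => $4,
        where_value  => $5,
    };
}

# Hypothetical helper: check that both columns named in the statement
# exist in the file's header. Returns an error message, or nothing if valid.
sub validate_columns {
    my ($cmd, @header) = @_;
    my %known = map { $_ => 1 } @header;
    for my $column ($cmd->{set_column}, $cmd->{where_column}) {
        return "Unknown column '$column'" unless $known{$column};
    }
    return;
}

# Hypothetical helper: open a file for update and take an exclusive,
# non-blocking lock. Returns the file handle, or nothing on failure.
sub open_locked {
    my ($path) = @_;
    open(my $fh, '+<', $path) or return;
    unless (flock($fh, LOCK_EX | LOCK_NB)) {
        close $fh;    # another process already holds the lock
        return;
    }
    return $fh;
}
```

A caller would run these in order (parse, lock and open, validate against the header) and stop at the first step that fails, so that a rejected statement never touches the data structure.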
I soon realised that simple SQL commands wouldn't be sufficient, and that users would need to do more in order to make CsvSQL usable in a corporate environment. For this reason I added several non-SQL commands. These commands are:
open - Opens a file and reads its contents into memory without the need to first select it.
close - Closes a file and releases all locks. This means the user can access multiple files in one session.
describe - Reads the first line of a file, allowing users to "peek" at a file to check whether it contains the information they are looking for without having to read the entire file into memory.
set pipe - Sets the output of the program to both print to the screen and append to a specified output file. This allows the user to create a new CSV from specified sections of any number of other CSV files.
ls / dir - Lists either the current directory contents, or those of a specified path. By running this command a user can find a file they are looking for without ending the current session.
clear / cls - Clears the screen. As CSV files can be quite large they take up a lot of screen real estate when printed, and at times it's easier to analyse data when it is shown on its own without previous selections in view.
version - Prints the version number of CsvSQL.
dump - Prints a dump of the data structure in memory. While originally I had only intended to use this command for testing, and was planning on removing it before release, I left it in for its interest value. Also, should a user later come across a bug where the program doesn't perform as expected, I can request a dump of the data (should it not be sensitive), along with the original CSV and the command issued, in order to debug the problem and release a patch for the program.
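As an illustration of two of these commands, the following is a minimal sketch of how "describe" and "set pipe" might work using only built-in functions. It is not the actual CsvSQL code: the names describe_file, set_pipe, emit and close_pipe are hypothetical, and the header is split naively on commas, so quoted fields containing commas are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper for "describe": read only the first line of the file
# and return its column names, without loading the rest into memory.
sub describe_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;
    my $header = <$fh>;
    close $fh;
    return unless defined $header;
    $header =~ s/\r?\n\z//;          # strip a Unix or Windows line ending
    return split /,/, $header;       # naive split: no quoted-field support
}

my $pipe_fh;    # stays undefined until "set pipe" is issued

# Hypothetical helper for "set pipe": open the output file in append mode.
sub set_pipe {
    my ($path) = @_;
    unless (open($pipe_fh, '>>', $path)) {
        undef $pipe_fh;
        return 0;
    }
    return 1;
}

# All program output goes through emit(), so it reaches the screen and,
# when a pipe is set, the output file as well.
sub emit {
    my ($line) = @_;
    print $line, "\n";
    print {$pipe_fh} $line, "\n" if $pipe_fh;
}

# Close the output file so that buffered lines are written to disk.
sub close_pipe {
    close $pipe_fh if $pipe_fh;
    undef $pipe_fh;
}
```

Routing every print through one function is what makes the "screen and file" behaviour cheap to add: the SELECT code does not need to know whether a pipe is set.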
In structuring the actual code of the program I broke it into sections, each containing a group of subprocedures with similar functions. This enabled me to debug errors more quickly, as all code with a similar function was in the same place.
It also meant that there were defined steps that a command would follow in order to execute successfully, and a defined outcome for any command that was rejected for any reason.
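One common way to get this kind of structure in Perl is a dispatch table, sketched below. This is an assumption about how such a design could look rather than a description of CsvSQL's real internals: the %handlers table, the run_command name and the messages are all hypothetical. Each handler returns a success flag and a message, so an accepted command and a rejected one both follow a defined path.

```perl
use strict;
use warnings;

# Hypothetical handlers, one per command keyword.
# Each returns a list: (success flag, message).
my %handlers = (
    open => sub {
        my ($file) = @_;
        return (0, "open needs a file name")
            unless defined $file && length $file;
        return (1, "would open $file");
    },
    close   => sub { return (1, "would close the current file") },
    version => sub { return (1, "would print the version number") },
);

# Look up the first word of the input and pass the rest to its handler.
# Unknown or empty commands are rejected in one place.
sub run_command {
    my ($input) = @_;
    my ($keyword, $rest) = split ' ', $input, 2;
    return (0, "empty command") unless defined $keyword;
    my $handler = $handlers{ lc $keyword }
        or return (0, "unknown command '$keyword'");
    return $handler->($rest);
}
```

Adding a new command then means adding one entry to the table, and every rejection reaches the user through the same return path.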