Load SQLite Tables from .csv files

| No Comments
Here is a quick script I wrote to load SQLite tables from CSV files.  I exported the files from Microsoft Access.  The first row has to have correct column names.

Text::xSV is the best CSV file handler I could find on CPAN.  It correctly handles files with embedded newlines as well as very large files.  SQL::Abstract makes it very easy to use straight DBI to speed up the load without going through an ORM. 

I prepare each row separately because I may have cases where there are trailing unused fields, and some Microsoft programs (Outlook comes to mind) do not attempt to output empty fields for them.  If the speed difference is important, you may want to try to guarantee the exact number of fields in each row so you can prepare once.

#!/usr/bin/perl

# load_csv.pl - Load a db table from a csv file.
#               First row must be field names.
# 04/28/2009  W. Ruppert

use strict;
use warnings;
use Text::xSV;
use DBI;
use SQL::Abstract;

$| = 1;

sub usage {
    my ($msg) = @_;
    warn "$msg\n" if $msg;
    die "Usage: load_csv.pl db table csvfile\n";
}

# get parms
my $db        = shift || usage "No database name";
my $table     = shift || usage "No table name";
my $data_file = shift || usage "No csv data file";

usage "No such file: $db"        unless -e $db;
usage "No such file: $data_file" unless -e $data_file;

# setup csv file
my $csv = new Text::xSV;
$csv->open_file("$data_file");
$csv->read_header();

my $sql = SQL::Abstract->new;

# connect to db
my $dbh = DBI->connect("dbi:SQLite:$db", "","",
        { RaiseError => 1, PrintError => 0 }
) or die "can't connect\n";

$dbh->do('begin');

my $max_commit  = 1000;
my $inserted    = 0;

# process csv rows
while (my %fieldvals = $csv->fetchrow_hash) {

    # SQL::Abstract sets up the DBI variables
    my($stmt, @bind) = $sql->insert($table, \%fieldvals);

    # insert the row
    my $sth = $dbh->prepare($stmt);
    $inserted += $sth->execute(@bind);

    # progress bar
    print "*"            unless $inserted % 10;
    print " $inserted\n" unless $inserted % 500;

    # commit every once in a while
    unless ($inserted % $max_commit) {
        $dbh->do('commit');
        $dbh->do('begin');
    }
}

$dbh->do('commit');
$dbh->disconnect;

print "\nInserted $inserted records\n";
exit 1;

Leave a comment

About this Entry

This page contains a single entry by Bill Ruppert published on April 28, 2009 4:28 PM.

Firefox FOUC fixed was the previous entry in this blog.

Promoting Perl is the next entry in this blog.

Find recent content on the main index or look in the archives to find all content.

Pages

OpenID accepted here Learn more about OpenID
Powered by Movable Type 4.38