Converting csv to sql using php

In previous posts we’ve discussed getting your csv files into sql format using different methods. I walked you through the process from csv to sql using HeidiSQL and phpMyAdmin. I then did a follow-up on how to import a large csv file into a MySQL database using the MySQL command prompt. In all these instances, what we were essentially doing was creating queries, just in different ways.

Well, today I want to present another method of getting your csv file into sql, using PHP code. Full credit for this piece of code goes to legend. Make sure the database and the table already exist before you dump the data.

<?php
/********************************************************************************************/
/* Based on the code at http://legend.ws/blog/tips-tricks/csv-php-mysql-import/
/* Edit the entries below to reflect the appropriate values
/********************************************************************************************/
$databasehost = "localhost";
$databasename = "test";
$databasetable = "sample";
$databaseusername = "test";
$databasepassword = "";
$fieldseparator = ",";
$lineseparator = "\n";
$csvfile = "bbqrest.csv";
/********************************************************************************************/
/* Would you like to add an empty field at the beginning of these records?
/* This is useful if you have a table with the first field being an auto_increment integer
/* and the csv file does not have such an empty field before the records.
/* Set 1 for yes and 0 for no. ATTENTION: don't set to 1 if you are not sure.
/* This can dump data in the wrong fields if this extra field does not exist in the table
/********************************************************************************************/
$addauto = 0;
/********************************************************************************************/
/* Would you like to save the mysql queries in a file? If yes set $save to 1.
/* Permission on the file should be set to 777. Either upload a sample file through ftp and
/* change the permissions, or execute at the prompt: touch output.sql && chmod 777 output.sql
/********************************************************************************************/
$save = 1;
$outputfile = "output.sql";
/********************************************************************************************/

if(!file_exists($csvfile)) {
    echo "File not found. Make sure you specified the correct path.\n";
    exit;
}

$file = fopen($csvfile,"r");

if(!$file) {
    echo "Error opening data file.\n";
    exit;
}

$size = filesize($csvfile);

if(!$size) {
    echo "File is empty.\n";
    exit;
}

$csvcontent = fread($file,$size);

fclose($file);

// The mysqli extension is used here; the older mysql_* functions were removed in PHP 7
$con = @mysqli_connect($databasehost,$databaseusername,$databasepassword) or die(mysqli_connect_error());
@mysqli_select_db($con,$databasename) or die(mysqli_error($con));

$lines = 0;
$queries = "";
$linearray = array();

// explode() takes the place of split(), which was also removed in PHP 7
foreach(explode($lineseparator,$csvcontent) as $line) {

    $line = trim($line," \t");

    $line = str_replace("\r","",$line);

    // Skip blank lines (such as the one after a trailing newline) so they are not inserted
    if($line === "")
        continue;

    $lines++;

    /**********************************************************************************************/
    /* This line escapes the special characters. Remove it if entries are already escaped in the csv file
    /**********************************************************************************************/
    $line = mysqli_real_escape_string($con,$line);
    /**********************************************************************************************/

    $linearray = explode($fieldseparator,$line);

    $linemysql = implode("','",$linearray);

    // NULL in the first position lets the auto_increment column generate its own value
    if($addauto)
        $query = "insert into $databasetable values(NULL,'$linemysql');";
    else
        $query = "insert into $databasetable values('$linemysql');";

    $queries .= $query . "\n";

    @mysqli_query($con,$query);
}

@mysqli_close($con);

if($save) {

    if(!is_writable($outputfile)) {
        echo "File is not writable, check permissions.\n";
    }

    else {
        $file2 = fopen($outputfile,"w");

        if(!$file2) {
            echo "Error writing to the output file.\n";
        }
        else {
            fwrite($file2,$queries);
            fclose($file2);
        }
    }

}

echo "Found a total of $lines records in this csv file.\n";

?>

So that is one nice easy way to do it, and the code is easy to follow and understand, so that you can mod and adapt it to your needs. Many thanks to legend for this code.
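One thing to watch with the script above: it cuts each line at every separator, so a quoted field such as "Smith, John" ends up split across two columns. Here is a minimal sketch of one way around that, not part of legend's code. It assumes PHP 7 or later with the PDO extension, and the function name importCsv and its parameters are made up for this illustration. It reads the file with fgetcsv(), which understands quoted fields, and sends each row through a prepared statement so that the values never need to be escaped by hand.

```php
<?php
/**
 * Read a CSV file and insert every row through one prepared statement.
 * Assumes the table already exists and its columns match the CSV fields in order.
 * importCsv() is a hypothetical helper written for this example.
 */
function importCsv(PDO $pdo, string $table, string $csvFile, bool $addAuto = false): int
{
    // Table names cannot be bound as parameters, so only accept simple identifiers.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $table)) {
        throw new InvalidArgumentException("Unexpected table name: $table");
    }

    $handle = fopen($csvFile, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $csvFile");
    }

    $statement = null;
    $count = 0;

    // fgetcsv() keeps a quoted field such as "Smith, John" together as one value.
    while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // blank line
        }
        if ($addAuto) {
            array_unshift($fields, null); // NULL lets an auto_increment column fill itself
        }
        if ($statement === null) {
            // Build "?,?,?" with one placeholder per field of the first row.
            $placeholders = implode(',', array_fill(0, count($fields), '?'));
            $statement = $pdo->prepare("INSERT INTO $table VALUES ($placeholders)");
        }
        $statement->execute($fields);
        $count++;
    }

    fclose($handle);
    return $count;
}

// Possible usage, with the same example values as the script above:
// $pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', 'test', '');
// $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// echo importCsv($pdo, 'sample', 'bbqrest.csv'), " records imported.\n";
```

The statement is prepared from the first row, so this sketch expects every row to have the same number of fields. A file with ragged rows would need extra checks.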

Importing large csv files into sql using the command prompt

I recently did a series of posts on how to convert a csv file into a sql file. You can see a summary here that will lead you to the other posts.

In that series I talked you through doing the import using HeidiSQL and then creating the sql file using phpMyAdmin, a process which is easy and straightforward to use. However, for those who like to use the MySQL command prompt, there is another way to go through this process, using the LOAD DATA INFILE command.

What you do is get to your MySQL command prompt and log in as root. Create the database and the tables using the same basic process I outlined before in my newbie tutorial about importing large sql dumps. In summary:

1. Create the database:

mysql> CREATE DATABASE database_name;

2. If this is done correctly with no typos, you will get a success statement to the effect:

Query OK, 1 row affected (0.00 sec)

3. Switch to the database you just created using the USE command.

mysql> USE database_name;

You should see the text

Database changed

Then create the table using the CREATE TABLE command:

CREATE TABLE table_name
(column_1 data_type_for_column_1,
column_2 data_type_for_column_2,
… );

Remember that you want your columns to correspond to the fields and data types in your csv file.

Once the database and table are created, you are ready to import your csv file. Remember that you can use this same process for any other kind of delimited file, whether it uses commas, semicolons, or other types of delimiters. To import the csv or other delimited file into MySQL, use the LOAD DATA INFILE command. This command has several options and things you need to consider. Here is the basic syntax:

LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
[REPLACE | IGNORE]
INTO TABLE tbl_name
[CHARACTER SET charset_name]
[FIELDS
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
[IGNORE number LINES]
[(col_name_or_user_var,...)]
[SET col_name = expr,...]

This may look a little confusing, but it really isn’t. Everything in square brackets [] is optional; which options you need depends on your source file and your data, and many of them can simply be left out. If you study the syntax carefully you will recognize a lot of the same options we encountered when using HeidiSQL to import the file, for example

[FIELDS
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]

for the example file that we used in the previous tutorials would be:

FIELDS
TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\'

Does this look familiar? See if you can redo the process of importing the csv file that we went through in the tutorial but this time using the MySQL command line. This is a useful exercise because you may be faced with a situation where the command line is all you have. So have a go at it!

I recommend that you read up on the syntax of the LOAD DATA INFILE command at http://dev.mysql.com/doc/refman/5.1/en/load-data.html. This is for v.5.1, so if you are using a different version, please check to make sure the syntax is correct.
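If you would like something to check your attempt against, here is a small sketch in PHP that prints a complete statement you could paste at the mysql> prompt. It is only an illustration under assumptions: the helper name buildLoadData is made up, the file and table names are the placeholder values from the PHP import script, the fields are assumed to be comma-separated and optionally enclosed in double quotes, and lines are assumed to end with \n (a file saved on Windows may need '\r\n' instead). The LOCAL keyword also assumes that local file loading is enabled on your MySQL client and server.

```php
<?php
/**
 * Build a LOAD DATA statement for a comma-separated file.
 * buildLoadData() is a hypothetical helper; $ignoreLines skips header rows.
 */
function buildLoadData(string $csvFile, string $table, int $ignoreLines = 0): string
{
    // Only accept simple identifiers as table names.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $table)) {
        throw new InvalidArgumentException("Unexpected table name: $table");
    }

    // Escape backslashes and single quotes so the path is a valid SQL string literal.
    $file = strtr($csvFile, ["\\" => "\\\\", "'" => "\\'"]);

    $sql = "LOAD DATA LOCAL INFILE '$file'\n"
         . "INTO TABLE $table\n"
         . "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'\n"
         . "LINES TERMINATED BY '\\n'";

    if ($ignoreLines > 0) {
        $sql .= "\nIGNORE $ignoreLines LINES";
    }

    return $sql . ";";
}

// Print a statement that skips one header line.
echo buildLoadData('bbqrest.csv', 'sample', 1), "\n";
```

The IGNORE 1 LINES part is only useful when the first line of the file holds column names; pass 0 to leave it out.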

If you have any difficulties, questions, comments, or have noticed some errors in this tutorial please do leave a comment. I will greatly appreciate it!

Happy Coding!

Finding and Deleting Duplicates in Excel

Sometimes when working with large excel spreadsheets, you run into the problem of duplicates, and it helps to be able to find them and delete them without having to manually go through the spreadsheet. If you have two or more columns of data, and you need to find and delete the duplicates, this is one way to do it. Say you have all your data in columns A and B:

  1. Create a column that concatenates the information in A and B. You can either use the "CONCATENATE" function or use the ampersand (&) by entering the formula A1&B1 in cell C1. Remember, as with all formulae, you have to put the = sign before the formula, thus =A1&B1.
  2. Copy and paste the formula down the column to the end of your data. For example, if your data runs from rows 1 to 120, copy the formula down to cell C120. An easy way to do this in Windows is to select from the cell where your formula is to the last cell and press Ctrl+D, or use Edit -> Fill Down. On a Mac, I’m not sure there’s a shortcut other than Edit -> Fill Down.
  3. The next step is to find and mark the duplicate entries. Go to column D, and in the first cell enter the formula =IF(COUNTIF($C$1:C1,C1)>1,"Duplicate","Unique")
  4. Copy and paste this formula down the column to the end of your data using the same process as in step 2.
  5. This column now shows you which rows are duplicates and which are unique. You now want to delete, or do something else to, just the duplicates, so you want to sort your spreadsheet by duplicates and uniques. You can’t do this with your spreadsheet as it is now because of the formulas, so copy all the data to a new worksheet, pasting only the values and not the formulas, so that you can sort by column D. Use Edit -> Paste Special -> Values only.
  6. Sort your spreadsheet by column D so that all the duplicates are grouped together. Once you delete them, or do whatever you want to with them, you can then delete the helper columns C and D.

This leaves you with only unique entries in your worksheet. You can extend this method to multiple columns for more complex spreadsheets and it should work fine.
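For anyone who thinks more easily in code than in formulas, the logic of steps 1 to 4 can be sketched in a few lines of PHP. This is only an illustration of the idea, not something Excel runs. The function name markDuplicates and the sample rows are invented for the example.

```php
<?php
/**
 * Label each row "Unique" the first time its concatenated key appears and
 * "Duplicate" every time after that, like the running COUNTIF in column D.
 * markDuplicates() is a hypothetical helper written for this example.
 */
function markDuplicates(array $rows): array
{
    $seen = [];
    $labels = [];

    foreach ($rows as $row) {
        $key = implode('', $row); // same idea as =A1&B1 in column C
        $labels[] = isset($seen[$key]) ? 'Duplicate' : 'Unique';
        $seen[$key] = true;       // remember the key for the rows that follow
    }

    return $labels;
}

// Sample data: the third row repeats the first.
$rows = [['Ann', 'Austin'], ['Bob', 'Dallas'], ['Ann', 'Austin']];
print_r(markDuplicates($rows));
```

One caveat applies to the spreadsheet as much as to the sketch: plain concatenation treats "ab" next to "c" the same as "a" next to "bc". If your data could collide like that, put a separator between the parts, for example =A1&"|"&B1.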

Happy Coding!

From csv to sql – in brief

In my last few posts I explained in a four-part tutorial how to convert a csv file to a sql file. I am going to summarize the steps in this post so that you have them all in one reference post. The tools that I prefer for this process are phpMyAdmin and HeidiSQL, and of course your csv file. The assumption is that you have Apache, PHP, and MySQL all running on your machine, and you can log in as root admin.

  1. Create a MySQL database and then create a table that corresponds to your csv file. Make sure that the fields correspond, and I recommend that you create an auto-increment unique ID field as your primary key, even if it doesn’t exist in your csv file. Read more on Creating the Database.
  2. Using HeidiSQL, import your csv file into the database table that you have created, making sure to deselect the ID field if it doesn’t exist in your csv file. Read more on Importing the csv file.
  3. Look through your newly populated database table to ensure that the data has all been imported correctly and that you have the correct number of entries. The best tool for this is phpMyAdmin.
  4. Finally, create the sql dump file using the export to sql command in either phpMyAdmin or HeidiSQL. Make sure to select the correct export options depending on whether your sql dump file will create the database and populate it or will update an already existing database. Read more on Creating the sql dump file.

I hope that you have found the series of tutorials helpful, and that you are more comfortable working with all the database tools that I have been talking about.

Happy Coding!