Installing RWeka Package in R on OS X

I’m currently taking a data mining class, and I’m trying to do as much of the coding for it as I can in R for practice. We are using Weka 3, which so far is proving to be a pretty neat (FREE) data mining tool. To meet my goal of improving my R skills while taking this class, I needed a way to read Weka’s .arff data files into R so that I can access the data and run the assigned tasks.

The first option, which is what most of my classmates are doing, is of course to load the data into Weka, convert it to a CSV file, and then load that file into R using the read.csv() command. I was curious to see if I could do it more directly.


Win a Copy of the MODx Web Development book and more – Blog Contest

Welcome to the first ever Coding Pad Blog Contest!! We’re giving away some stuff – so read on!

What’s the occasion? – None really, just thought it would be cool to celebrate the MODx Content Management Framework/System.

What’s my goal? – To get more exposure for MODx and to get more people interested in MODx and its power and flexibility. And of course, to invite more interaction on the Coding Pad 🙂

What’s up for grabs?

  • A copy of the MODx Web Development book by Antano Solar John. This is the first ever MODx book in the English language. I currently have three copies to give away to three winners, but hope to get more in the coming days.
  • In addition to the book, the top winning entry will also get a Namecheap account with $20 credit (you can use this for purchasing domain names) and a three month membership to the Tuts+ network!
  • In addition to the book, the second winning entry will get a three month membership to the Tuts+ network.
  • In addition to the book, the third winning entry will get an opportunity for some free MODx consulting from modxguru.com’s Shane Sponagle, an experienced MODx Developer. You get to choose from one of two consulting options: an analysis of your MODx website with a report on how to improve it, or 2 hours of email support consulting. You will have three months from the day the contest ends to use the consultation.
  • The fourth and fifth winning entries will each get a one month membership to the Tuts+ network.

I anticipate having more prizes to offer, which would also allow for more winning entries.


A delimiter with non-delimiter uses during data import in Excel

I ran into an interesting situation the other day when trying to import some data into Excel. I had a text file with words and definitions that I needed to have in two columns in Excel, one for the word and one for the definition.

The words and their definitions were separated by a hyphen (-), so I could have gone ahead and done a direct import and specified the hyphen as the delimiter. The tricky part was that the hyphen appeared in some of the definitions too, and so using it as a delimiter would have split some of the definitions into separate columns, and I would have had to go through the sheet and reunite the definitions. Given that I had over 10,000 entries, this was a less than optimal solution, and not very time efficient.

I work a lot with Excel and Excel formulas, but I am by no means an Excel guru, and I am sure there’s a pretty simple solution, maybe a macro or something, for this kind of situation. But I used a formula to solve my problem as explained below:

What I did was paste the contents of my text file into one column, with no separation between word and definition and each entry in its own row. I then wanted to split the column into two, one with the word and one with the definition. To do that, I used the LEFT, RIGHT, LEN, and FIND functions: FIND locates the first hyphen, LEFT takes everything before it, and RIGHT takes everything after it, so any later hyphens stay inside the definition. With my data in column A, I put the following formulas in columns B and C respectively and filled down.

=LEFT(A1,FIND("-",A1)-1)
=RIGHT(A1,LEN(A1)-FIND("-",A1))

This worked perfectly, and solved my problem in seconds. Do you have any suggestions for a different way to do this?
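For anyone who would rather script this outside of Excel, here is a minimal sketch of the same split-at-the-first-hyphen idea in Python. This is not what I actually used; the function name split_entry and the sample entries are made up for illustration, and trimming the spaces around the hyphen is an extra step that the formulas do not perform.

```python
def split_entry(line, delimiter="-"):
    """Split one 'word - definition' line at the first delimiter only."""
    # partition() cuts at the first occurrence, like FIND in the formulas,
    # so any later hyphens stay inside the definition.
    word, found, definition = line.partition(delimiter)
    if not found:
        # No delimiter at all: keep the whole line as the word.
        return line.strip(), ""
    # strip() trims the spaces around the hyphen (an addition in this
    # sketch; the Excel formulas leave them in place).
    return word.strip(), definition.strip()


# Made-up sample entries, for illustration only.
sample = [
    "ad hoc - improvised - not planned in advance",
    "byte - a unit of data, usually eight bits",
]
for entry in sample:
    print(split_entry(entry))
```

Like the formulas, this assumes the word itself never contains a hyphen; a hyphenated word would be cut at its own hyphen.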

Importing large CSV files into Excel Using a Macro

In response to my post on importing large files into Excel by first splitting them, one of my readers, JP, pointed out that you can bypass the splitting step by using a VBA macro to do the import.

As you may know, Excel has a cutoff of 65,536 rows per worksheet (in Excel 2003 and earlier), so if you want to import a csv or text file that has more rows than that, you’ll run into trouble. This is where the csv splitter I mentioned before, or the macro that JP pointed out, comes in handy.

You can find the knowledge base article with the macro code, a VBA solution for importing large files into Excel, at http://support.microsoft.com/kb/120596.

And be sure to visit JP’s excellent website and blog, VBA Code for Excel and Outlook, where you’ll find a lot of useful macros and articles.

Large csv file? Download the CSV Splitter

If you have a large csv file that you have tried to open in Excel, you know how troublesome that can be, because Excel is limited in the number of rows and columns of data it can handle – 65,536 rows of data and 256 columns per worksheet. Truncation of rows or columns in excess of the limit is automatic and is not configurable.

I discussed this problem before in the post Splitting large csv files – the CSV Splitter, where I introduced you to Scorpion’s nifty little program, the CSV splitter. It takes a large csv file and splits it into separate smaller files, and you decide how many rows each file should have. Previously, as you will see if you read that post, you had to register on the forums to be able to download it. Scorpion just let me know that he has now provided a direct download link, so you no longer have to register.

You can now download the csv splitter directly here.

The program is easy to use and is a lifesaver if you are like me and frequently work with large csv files. You can read more about the csv splitter in the post Splitting large csv files – the CSV Splitter, and please do leave a thank you for Scorpion.
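If you are curious what this kind of splitting looks like in code, here is a rough sketch in Python. To be clear, this is not Scorpion’s program and I don’t know how it works internally. The function name split_csv, the output file naming, the default of 65,535 data rows per file, and the choice to repeat the header row in every part are all my own assumptions for illustration.

```python
import csv
from pathlib import Path


def split_csv(source_path, rows_per_file=65_535, has_header=True):
    """Split a csv file into numbered parts; return the list of part paths.

    Assumption: the header row (if any) is copied into every part, so
    65,535 data rows plus the header fit the 65,536-row worksheet limit.
    """
    source = Path(source_path)
    part_paths = []
    out_file = None
    writer = None
    rows_in_part = 0
    try:
        with source.open(newline="", encoding="utf-8") as src:
            reader = csv.reader(src)
            header = next(reader, None) if has_header else None
            for row in reader:
                # Start a new part for the first row or when the current part is full.
                if writer is None or rows_in_part >= rows_per_file:
                    if out_file is not None:
                        out_file.close()
                    # Hypothetical naming scheme: big.csv -> big_part1.csv, ...
                    part_path = source.with_name(
                        f"{source.stem}_part{len(part_paths) + 1}{source.suffix}"
                    )
                    out_file = part_path.open("w", newline="", encoding="utf-8")
                    writer = csv.writer(out_file)
                    if header is not None:
                        writer.writerow(header)
                    part_paths.append(part_path)
                    rows_in_part = 0
                writer.writerow(row)
                rows_in_part += 1
    finally:
        # Close the last part even if reading or writing failed.
        if out_file is not None:
            out_file.close()
    return part_paths
```

Calling split_csv("big.csv", rows_per_file=50000) would write big_part1.csv, big_part2.csv, and so on next to the original file, each small enough to open in one worksheet.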