Wednesday, July 08, 2009

Testing Redirection and Rewrite Rules with RSpec

One of the websites that I maintain has quite a few interesting Redirect and RewriteRules in its Apache configuration. Up until yesterday, I was testing them with Selenium; today I'm testing with RSpec.

The website is for the Computer Science department at Calvin. Most of it is a Rails app for a CMS. However, there were portions of the old site that I wanted to leave in the hands of my former colleagues; there were other paths into the site that needed to be redirected.

Selenium, Not Such a Good Idea for This Problem

I really like Selenium for writing website tests in a browser, and I've seen how useful it can be for developing Rails apps. I applied it to this rewrite-and-redirect problem, and it was more than a bit of overkill. In my role as "website developer", I just needed to make sure the redirects were in place. With Selenium, though, I found myself testing other features, like content on pages or in directories that I wasn't really in charge of.

The Selenium tests also ran slowly since they used Firefox itself to access the website.

I also found the tests hard to maintain. This is mostly on me because I saved and edited them in an HTML format. It would have been better to use Ruby. But if I was going to rewrite the tests in a format that was easier to use, why not also go for something faster and more targeted?

Ruby and RSpec

Ruby has libraries to access URLs through HTTP... core libraries. I can write expectations in RSpec. I can write expectations about the HTTP responses.

This was all theoretically possible until I came across "Test Drive mod_rewrite Rules with Test::Unit" by Patrick Reagan. Someone had done it already (as I figured) with Test::Unit. Instead of using Patrick's solution wholesale, I decided to tailor it more towards my needs, especially towards RSpec.

My mod_rewrite Expectations with RSpec

Here's what one of my expectations looks like:

it "should redirect curriculum pages" do
  get("/curriculum/bcs.php").should redirect_to("/p/bcs")
  get("/curriculum/bacs.php").should redirect_to("/p/bacs")
  get("/curriculum/bais.php").should redirect_to("/p/bais")
  get("/curriculum/bada.php").should redirect_to("/p/bada")
end

We had a few pages for the different computing degrees at Calvin College: an ABET-accredited Bachelor's of Computer Science and plain BAs in Computer Science, Information Systems, and "Digital Arts". We definitely wanted all links to the old pages to go to the new ones.

get(path) is my method:

def get(url)
  RedirectCheck.new(ResourcePath.new("http://cs.calvin.edu" + url))
end

Yes, I've hard-coded the server in this test. It keeps the rest of the code simpler, and it solves the problem I have now.

The two classes used here are based on classes of the same name by Patrick. You can see my versions in the GitHub repository. I mostly just simplified the code for my purposes.

The really important class is RedirectCheck. It accesses the URL, saves the response, and then parses it for queries like success? and redirected? and redirected_path.
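To make the idea concrete, here is a minimal sketch of what a class like this might look like using Ruby's Net::HTTP. It is not the code from the repository: the class name RedirectCheckSketch and the optional response argument are assumptions made for illustration, and it takes a plain URL string rather than a ResourcePath.

```ruby
require "net/http"
require "uri"

# Hypothetical stand-in for RedirectCheck; not the code from the repository.
class RedirectCheckSketch
  # Pass a canned response as the second argument to skip the network.
  def initialize(url, response = nil)
    # Net::HTTP.get_response does not follow redirects, so a 3xx response
    # comes back as-is and can be inspected.
    @response = response || Net::HTTP.get_response(URI.parse(url))
  end

  def success?
    @response.is_a?(Net::HTTPSuccess)
  end

  def redirected?
    @response.is_a?(Net::HTTPRedirection)
  end

  # Path portion of the Location header, or nil if there was no redirect.
  def redirected_path
    location = @response["location"]
    return nil unless redirected? && location
    URI.parse(location).path
  end
end

# A 302 response built by hand, so no request is made here.
found = Net::HTTPFound.new("1.1", "302", "Found")
found["location"] = "http://cs.calvin.edu/p/bcs"
check = RedirectCheckSketch.new("http://cs.calvin.edu/curriculum/bcs.php", found)
puts check.redirected?
puts check.redirected_path
```

Comparing only the path of the Location header is what lets an expectation say redirect_to("/p/bcs") without repeating the server name.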

success? allowed me to write expectations like this without any extra work:

it "should have a books activity" do
  get("/activities/books/").should be_success
  get("/a/books/").should be_success
end

redirected? and redirected_path make the redirect_to(path) matcher easy:

Spec::Matchers.define :redirect_to do |redirection|
  match do |redirect_check|
    redirect_check.redirected?.should == true
    redirect_check.redirected_path.should == redirection
  end
end

For some reason, redirect_check.should be_redirected triggers an undefined error for be_redirected.

redirection is the redirection path (i.e., the expected path). redirect_check is a RedirectCheck object created by get(path).

Performance

It takes about 3 seconds to run 15 examples, so it's not the speediest RSpec suite around (by a long shot). But these aren't tests to be run often. Plus, the original Selenium tests never ran that fast!

So, given the problem I was trying to solve, I really like this solution. It'll be easy to add more expectations if I need to. It's really easy to run them, too. I'll save Selenium for a problem that really requires it!

Monday, July 06, 2009

Converting from Test::Unit to RSpec in a Rails App

I'm going to write up, in real time, my process for turning the Test::Unit tests into RSpec examples for another Rails app. The app is called "YAGS" (Yet Another Genetics Simulator); it simulates the genetics of fruit flies for a college-level biology course. I did this transformation once before, for an app that is a little bit more than a CMS for my former CS department.

My post for the department CMS transformation was organized topically. I'm organizing this one chronologically.

Measure the State of the App

# unit tests
124 tests, 758 assertions, 0 failures, 0 errors
Test suite finished: 5.091451 seconds
# functional tests
164 tests, 1311 assertions, 0 failures, 0 errors
Test suite finished: 6.700129 seconds
# integration tests?! we don't need no stinkin' integration tests!

Install RSpec and RSpec-Rails in the App

While I'm at it, I think I'll also get Cucumber set up. (The last line in my "state of the app" is a lie; everyone needs integration tests!)

I drop these into my config/environments/test.rb file:

config.gem "rspec", :lib => false, :version => ">= 1.2.7"
config.gem "rspec-rails", :lib => false, :version => ">= 1.2.7"
config.gem "aslakhellesoy-cucumber", :lib => "cucumber", :version => ">= 0.3.11"
config.gem 'webrat'

These may be useful for a limited time only (especially getting cucumber from GitHub). I run these shell commands:

# I don't bother to install since I know they're installed
rake gems:unpack RAILS_ENV=test
rake gems:unpack:dependencies RAILS_ENV=test
./script/generate rspec
./script/generate cucumber

Question for Rails experts: with the gems tasks (like install and unpack), is it necessary or even desirable to add the gems to version control? Can't you have your deploy script do the installing and unpacking? I'd really appreciate feedback on this.

I run the spec and features tasks, and they both run without errors. Cucumber at least reports "0 scenarios, 0 steps"; RSpec is mute.

Copying Tests Over

mkdir spec/models
mkdir spec/controllers
mkdir spec/views
mkdir spec/fixtures
cp test/unit/*.rb spec/models/
cp test/functional/*.rb spec/controllers/
cp test/fixtures/*.yml spec/fixtures/

I have to make the directories because I haven't generated any examples for RSpec directly. When I finish with these steps, I discover why the spec task didn't seem to do anything: it didn't find any specs because the filenames of specs end in _spec.rb, not _test.rb. I wish I could give a command-line solution for this, but I haven't been able to find a way to get mmv (multiple mv) installed on my Mac. Coincidentally, I just discovered NameChanger today, and it does a very nice job at renaming the files.

I suppose this might work:

# run in spec/models, spec/controllers
for i in *_test.rb; do
  mv "$i" "${i%_test.rb}_spec.rb"
done

(After I finished the whole Test::Unit-to-RSpec transformation, I discovered that mmv is available in MacPorts! The name of the package is mmv. I'm too lazy now to figure out what the right mmv command would be, but it's not too hard to figure out.)
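Since Ruby is already at hand, the rename could also be scripted there. This is only a sketch: the method name rename_tests_to_specs is made up for illustration, and it assumes every file to rename ends in _test.rb and sits directly in the given directory.

```ruby
require "fileutils"

# Renames every *_test.rb file directly inside dir to *_spec.rb.
# Returns the new paths. (Hypothetical helper, not part of the app.)
def rename_tests_to_specs(dir)
  Dir.glob(File.join(dir, "*_test.rb")).sort.map do |old_path|
    new_path = old_path.sub(/_test\.rb\z/, "_spec.rb")
    FileUtils.mv(old_path, new_path)
    new_path
  end
end

# Possible use from the Rails root:
#   rename_tests_to_specs("spec/models")
#   rename_tests_to_specs("spec/controllers")
```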

Can't Find Test Helpers

no such file to load -- ./spec/controllers/../test_helper (LoadError)

That's because they're spec helpers now!

I use RubyMine's "Search > Replace in Path..." to replace the test_helper requires with spec_helper requires.

I know the next problem: helpers that I haven't copied over to spec_helper!

nokogiri Gem Acting Up

Two minutes later... I'm wrong, of course! I'm not having troubles with the helper methods yet. I'm having troubles with nokogiri and/or webrat. The basic error message is this:

no such file to load -- nokogiri/nokogiri

The spec task also spits out a lengthy stack trace and concludes that webrat isn't installed. Well, newsflash! webrat is installed, and so is nokogiri. But there isn't a lib/nokogiri/nokogiri.rb file anywhere that makes sense out of the error message. Even more frustrating, my other Rails app isn't having a problem with webrat and nokogiri.

There is, however, a difference between the two apps: I do not have webrat and nokogiri in vendor/gems in the CMS app. They have been unpacked in the YAGS app.

rm -rf vendor/gems/webrat-0.4.4
rm -rf vendor/gems/nokogiri-1.3.2

Problem solved. I won't say I'm happy with the solution since it begs the question why it works, but webrat (and Cucumber) have been in enough flux lately that I don't worry if I have to do something out of the ordinary to get them to work for a while. As far as "out of the ordinary" goes, deleting the gems from vendor/gems isn't going to cause me to lose any sleep.

And I get my predicted error: methods defined in test_helper.rb that aren't in spec_helper.rb.

Adding to Spec helpers

I start moving methods over. One bad thing we did in test_helper.rb was to define the helper methods at the top level. So I'm putting them into modules, and then using config.extend and config.include for class and instance methods, respectively. (I believe I learned about this from the RSpec book. There should be documentation online for this as well.)
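The difference between the two kinds of helpers can be sketched in plain Ruby. The module and method names here (FlyHelpers, FlyMacros, red_eyed?, phenotypes) are invented for the illustration, and FakeExampleGroup is only a stand-in for an RSpec example group; the commented-out configuration shows roughly how the wiring would look in an RSpec 1.x spec_helper.rb.

```ruby
# Instance-level helpers: meant to be called inside an example.
module FlyHelpers
  def red_eyed?(fly)
    fly[:eye_color] == :red
  end
end

# Class-level helpers (macros): meant to be called in a describe block.
module FlyMacros
  def phenotypes
    [:eye_color, :wings]
  end
end

# In spec_helper.rb the wiring would look roughly like this:
#
#   Spec::Runner.configure do |config|
#     config.include FlyHelpers   # instance methods
#     config.extend  FlyMacros    # class methods
#   end

# Plain-Ruby stand-in for an example group, to show the effect.
class FakeExampleGroup
  include FlyHelpers
  extend FlyMacros
end

puts FakeExampleGroup.new.red_eyed?({ :eye_color => :red })
puts FakeExampleGroup.phenotypes.inspect
```

Keeping the helpers in modules also means they stop leaking into every object the way top-level methods do.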

One thing I notice is that I'm not requiring webrat in spec_helper.rb, and I did require it explicitly in my CMS app. Requiring in the YAGS app doesn't revert me to the nokogiri/nokogiri error from before, so I'll leave the require in.

However, I get one method moved over, and suddenly I'm getting a complaint about shoulda's should_have_many.

Shoulda Gem with RSpec and Test::Unit

undefined method `should_have_many' for UserTest:Class

After fighting with this for a while, here's what I've figured out. The excellent shoulda gem is being loaded; the context method works just fine. So it's not a load path problem or other directory issue. Poking around in the code, shoulda.rb itself asks this question: defined? Spec. It then loads different libraries depending on the answer. That is, you can have shoulda code for RSpec or Test::Unit, but not both!
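The mechanism is easy to reproduce in a few lines. This is not shoulda's actual code, just a sketch of the same kind of check; the method name framework_flavor and its return values are made up, and it assumes nothing has defined a Spec constant before the first call.

```ruby
# Picks a "flavor" based on whether the Spec constant (RSpec 1.x's
# top-level module) exists at the moment of the call.
def framework_flavor
  if defined?(Spec)
    :rspec
  else
    :test_unit
  end
end

before = framework_flavor   # Spec is not defined yet

module Spec; end            # pretend RSpec has now been loaded

after = framework_flavor

puts before
puts after
```

Because the check happens once, when the library is loaded, whichever framework is present at that moment decides which set of macros you get.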

Keep in mind that while I'm converting to RSpec, the tests themselves are still very much in a Test::Unit form. RSpec is supposed to handle them without too many problems.

Perhaps there are some games I could play with requiring files myself, but I'm not sure how compatible the RSpec and Test::Unit versions are, and I figure: why not start the true conversion to RSpec now? So I end up moving the should_have_many assertion to a new describe block:

describe User do
  it { should have_many :vials }
end

I also drop the underscore between should and have_many. (I'm fudging a bit on the chronology here because I discovered this problem later.) I'm actually not sure if the describe block is completely necessary after this change, but since I want to move in that direction anyway, I'm going to keep going in that direction.

I keep this in the same file as the Test::Unit test case. Works fine, and so I go on to fix all of these shoulda problems.

should Is a Reserved Word

Well, not really, of course. Actually, the problem is that both shoulda and RSpec have it as a "reserved word", but the meaning is different. (I suppose this might be a definition of an Object- or Kernel-level method: a keyword whose meaning can be changed. It should (pardon the pun) be treated as a keyword, but its meaning can be changed depending on which class it's added to.)

All the old shoulda tests that use should "X" have to be turned into it "should X". Time for Replace in Path again!
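For anyone without a handy Replace in Path, the same substitution can be sketched as a regular expression in Ruby. The method name shoulda_to_rspec is invented here, and the pattern assumes each shoulda test starts a line with should followed by a quoted string.

```ruby
# Rewrites lines of the form
#   should "do X" do
# into
#   it "should do X" do
# and leaves every other use of should alone.
def shoulda_to_rspec(source)
  source.gsub(/^([ \t]*)should (["'])/) { "#{$1}it #{$2}should " }
end

puts shoulda_to_rspec(%q{  should "have a name" do})
```

Anchoring on the start of the line and on the opening quote keeps the pattern away from RSpec-style uses such as it { should have_many :vials } or foo.should == bar.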

There are only a few to change, and now finally the specs all run and 186 out of 288 fail! It appears I have more methods to move to spec_helper.

Smelly Code

Along the way, I discover that some of the helper methods are quite stinky. Or some of the tests that they "inspire" are quite stinky. The biggest stink comes from black-box testing the associations and their dependencies. That is, for example, we had tests that would use the fixture data to make sure the right vial had the right flies in it. And if we deleted that vial, the right flies would also go. shoulda and RSpec make this much easier:

it { should have_many(:flies).dependent(:destroy) }

It reads short and sweet, and it doesn't involve any data checking.

Finishing Is Only the Beginning

How poetic. Gag!

287 examples, 0 failures, 287 passed
Finished in 12.497872 seconds

If you look at where I started with my tests, there were 288 unit and functional tests. I've added some new ones, and deleted old ones, so 287 examples sounds pretty good to me.

I'm not finished with the conversion, of course. I've discovered quite a few unit tests for models that still have only their default "test true" stub! Also I plan to separate the controller and view examples; this I typically do when needed though.

I need one set of view examples soon: I "ignored" an assert_standard_layout helper method. This method did a lot of looking at the HTML generated by the standard layout. It gets called several times throughout the tests. This was always overkill, but it seemed like a good idea to me to make sure that each action was using the standard layout and wasn't screwing it up. RSpec allows one to separate out the rendering from the controlling, so for the RSpec examples, for the time being, I just wrote a version of the method that does nothing. I need to write targeted view examples that will make the same assertions about the standard layout and then not worry about the standard layout anywhere else in the examples.

Biggest Mistake

Realistically, this isn't a huge mistake, but it was sloppy: I forgot to create a git branch for all of this work; I've been doing it all on master! Consequently, I've been afraid to make commits along the way since I try to keep the master green on every commit (or merge).

Worse yet, I realized this mistake somewhat early on, and did nothing about it. I believe that there are ways to stash what you're working on, create a new branch, and use the stash on the new branch. (I believe it's even git-stash that lets you do this.) Even if that didn't work, would it have been so terrible to revert all of my changes, create a new branch, and start over from scratch? Even if I did that now, would it be that much work?

Friday, July 03, 2009

Empirical Evidence: Learn Haskell

I read this morning an article by Lera Boroditsky, "How Does Our Language Shape the Way We Think?". Very interesting read.

I've often wondered if I should have become a linguist instead of a computer scientist (or mathematician). One idea that's intrigued me is the Sapir-Whorf hypothesis. According to that Wikipedia article (as of today), the hypothesis is

the idea that the varying cultural concepts and categories inherent in different languages affect the cognitive classification of the experienced world in such a way that speakers of different languages think and behave differently because of it.

Boroditsky's article suggests that there is something to the Sapir-Whorf hypothesis. The Wikipedia article (which I only scanned) seems to agree. The lingering questions, though, are how does it work and to what degree does it apply?

I've been interested in the hypothesis when teaching a Programming Languages course. Students will often complain about having to learn Scheme, Haskell, or Prolog (when they've been raised on Java and C++). The recursion in these languages is often a serious shift for the students (especially when other instructors tell them that they don't get recursion). Haskell has the extra "hassle" of a type system that's unlike type systems they've seen before. And Prolog is completely declarative, and even I find the data flow tricky.

Too often, if students talk about learning a new language, they're really looking to learn an old language with a new syntax. For the record, these are all pretty much the same language: Java, C++, C, PHP, Perl, C#, etc. While learning a couple of these will certainly improve a resume, it won't really make you a better programmer, just better skilled.

Scheme and Haskell made me think differently, especially Haskell. I picked up a LISP book when I was in high school, and I actually understood most of it (without running any of the code). Learning Scheme in grad school made me realize how deficient that LISP book was. Haskell made me appreciate composing functions, type systems, and pattern matching. Prolog got me into a declarative state of mind.

It'll be interesting to see where languages like Ruby and Python take us. I'd label both of these as "object-oriented languages with strong functional-programming influences". Can programmers learn the FP features? Will they embrace them? Would learning Scheme or Haskell first help? Hurt?

Well, whatever you think about these larger issues, read Boroditsky's article.

Wednesday, July 01, 2009

Upgrading Rails 1.2 to 2.3

While I reel from the terrible error in my previous post, I thought I'd post about my recent experience updating a Rails app from version 1.2 to 2.3. (BTW, the error is Andy's last name. I'm still right about the anti-if campaign.)

I had a real fun time upgrading a Rails app for my department's website (now on GitHub), and by "real fun time" I mean "lots of pain and suffering". More recently (like this past March) I discovered this wonderful guide by Peter Marklund. So when I went to upgrade a second app (also on GitHub), I went through his process.

Let me say this up front: I'm quite sure that Peter's instructions will work for you. They're good instructions: they tell you the steps to take and the pitfalls to watch for. Whatever they don't tell you, Rails will tell you.

However, the process took me a couple days to complete. Somehow I broke Rack. Once I reached the step of running some rake tasks, I was told that Rack::Request was an undefined constant. Um... that's a Rails internal (sort of); I really don't think that's my responsibility!

Long story short: I had my own Rack model in the app! That Rack class was shadowing the Rack module that's now part of Rails. Yesterday I spent 70 minutes refactoring my app to rename my Rack (to Shelf, thanks for asking), and it's working fine (at least according to the tests).
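A name clash like this can be reproduced in plain Ruby. This sketch is not how Rails loads Rack; it only shows, under the assumption that the app's class gets defined first, why a lookup like Rack::Request can fail. Everything lives in a made-up Sandbox module so that no real Rack constant is touched.

```ruby
module Sandbox
  # Stand-in for the app's own model named Rack.
  class Rack
  end
end

# Code expecting the library's module asks for Rack::Request,
# but finds the app's class, which has no such constant.
lookup_error =
  begin
    request_class = Sandbox::Rack::Request
    nil
  rescue NameError => e
    e
  end

# The library cannot reopen the name as a module either,
# because the name already belongs to a class.
reopen_error =
  begin
    Sandbox.module_eval("module Rack; end")
    nil
  rescue TypeError => e
    e
  end

puts lookup_error.class
puts reopen_error.class
```

Renaming the model, as with Shelf above, removes the clash because the two definitions no longer compete for one constant.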

Short story long: one of the keys in debugging this problem is that after being frustrated by the error for quite a while and trying to debug what was so terribly wrong with my load path, I decided to take smaller steps. Instead of going from version 1.2 to 2.3, why not upgrade to 2.1 first? That didn't work too well either, but I forget why now. Since it didn't cost much at that point, I tried to upgrade to 2.2. If that failed, I could try for 2.1 or even 2.0 for a smaller step; if it worked, then I wouldn't complain. Fortunately, the upgrade to 2.2 went fine.

I didn't immediately remember that Rack was added in 2.3, but that was a key insight when I finally realized the problem. I believe the tipping point came when I did a full text search of the whole project in RubyMine. Then all the references to the Rack model jumped out and screamed at me.

This makes me wonder what would have happened if I had looked over the app before upgrading. I had forgotten that the Rack model even existed. Would five minutes of scanning the models and controllers at the beginning have made this problem easier (even trivial) to debug?

Saturday, June 27, 2009

Anti-If Campaign

A few days ago, I tweeted this:

Only reading the tagline on the homepage, this seems like a campaign I can get behind: http://www.antiifcampaign.com/

This is actually a theme I've had in my head while teaching Programming Languages the last two years. (I say "in my head" because I was always coy about this, presenting the issues rather than making a blanket statement: ifs are evil!)

Andy Meneeley, a former student and friend of mine, responded:

@jdfrens Sounds intriguing, but they only have one (trivial!) code example, then the rest of the site is just promo: a typical Agile smell.

The slam against agile ("Agile", really? capitalized?) was to get my hackles raised. Well, raised they are! I tried composing a Twitter reply, but 140 characters doesn't do justice to my hackles.

Smackdown

Here's the thing: if you want a non-trivial example, if you want multiple examples, look at any object-oriented program which uses polymorphism! That's not using an if. I'm beginning to re-think my choice of introductory programming language; maybe we should go back to Pascal so that students can appreciate OOP!

Polymorphism just says: invoke this method, and the object will know what to do when the time comes. Ruby (and other dynamic OO languages) take this to an extreme: just call the method; no one's going to check on it until it's actually invoked. Java forces you to promise the behavior through interfaces, classes, and type checking, but when it comes down to me writing a polymorphic call, Java doesn't really care where that method is coming from. Just call it! What could be an easier way to program?!

When implementing the polymorphic methods, all you have to consider is a very restricted context. Again, what could be easier?

ifs are necessary to kick things off. Some code has to actually create the objects which will later make the polymorphic dispatches, but how much of your code consists of creating new objects? It could probably be delegated to a couple of factories if you really wanted to, and then the "if problem" would be localized.
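Here is a small Ruby sketch of that arrangement. The shapes and the build_shape factory are invented for the example; the point is only that the one conditional lives in the factory and everything after it is a polymorphic call.

```ruby
class Circle
  def initialize(radius)
    @radius = radius
  end

  def area
    Math::PI * @radius**2
  end
end

class Square
  def initialize(side)
    @side = side
  end

  def area
    @side**2
  end
end

# The factory: the only place that has to choose between the classes.
def build_shape(kind, size)
  case kind
  when :circle then Circle.new(size)
  when :square then Square.new(size)
  else raise ArgumentError, "unknown shape: #{kind}"
  end
end

shapes = [build_shape(:circle, 1), build_shape(:square, 2)]

# No ifs here: each object knows how to compute its own area.
areas = shapes.map { |shape| shape.area }
puts areas.inspect
```

Adding a Triangle would touch the factory and one new class; the code that maps over the shapes would not change.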

If X < Y

One question does intrigue me: mathematics. My dissertation involved matrix operations, and I took a couple of scientific computation classes in grad school. Could I write a QR factorization without ifs? Could it be done in a more OO way? This I'm not sure of. Perhaps the more mathematical the domain, the more ifs?

Not Just OO

One thing that gets ignored in these arguments is functional programming. There are a few ways that FP languages avoid ifs: case analysis, pattern matching, object-oriented extensions, and type classes. OO extensions are the least interesting in this list because, well, we're already talking about eliminating ifs with OOP. Case analysis is the "if problem", but a case analysis is possibly a misuse of an FP language. Pattern matching and type classes offer interesting solutions, and they're proving to tickle my interest more than I thought they would. I'll have to think about the FP solutions to the "if problem" and blog about it later.

During the meanwhile, all (and especially Andy) should contemplate the precedence and associativity of the phrase "former student and friend". (Too harsh? Too bitter?)

Thursday, June 04, 2009

Why haven't I wondered about this before?

I came across this article from InfoQ today: "Original Sin" (Would Java be Better Off Without Primitives?). I've been very impressed and happy with Ruby's commitment to making everything an object. I had been indoctrinated into ignoring primitive-versus-pointer distinctions by Scheme and Haskell, so to revert to Fortran/C thinking when I picked Java up was kind of annoying. My question after reading the article was why I assumed that Java had to have primitive data types.

There are some cool tidbits about Java in that article that make it worth reading. I'll spoil the coolest one: Java exists because Sun couldn't license Smalltalk. What could have been!

Friday, May 22, 2009

Name your variables, stop picking on the string class, and call methods symmetrically

Actually, the talk is "What the Ruby Craftsman Can Learn from the Smalltalk Master" by Philippe Hanrigou. In his talk, Philippe covers these three issues because they're interesting in themselves and also to demonstrate what others have discovered before us. The "others" in this case is Kent Beck and his Smalltalk Best Practice Patterns book.

Naming variables. Philippe discusses Beck's take on variables: name them after their roles. I couldn't agree more. I've ranted here about this before (haven't I?), but I'm convinced that a sign of a good programmer is the ability to find the right names for classes and variables. I'm almost convinced that this ability is sufficient and necessary to be a good programmer. (And, please, no one look at my CIAT project and some of the class names there. I'm really unhappy with them.)

It saddens me the number of students I've seen who'll use the variable names that first come to mind or are generated by Eclipse. (My Programming Languages students might want to look over their code for any arg0 variables that are being used in useful methods before I turn in grades next week.) I've found that a lot of the time a good variable name is based on the data type. For example, I've been writing a lot of interpreters lately. When I have an integer object that needs to be interpreted, isn't integer an obvious name? It does bother me a bit that it's also the data type of the object, IntegerETIR ("integer expression tree-intermediate-representation"). I can see an argument that I should call the variable "expression" or even "program". Perhaps a good starting point is to use a variation on the data type for the variable name, and if that's too ambiguous or misleading, find a better name for the role.

Giving the string class too much responsibility. Philippe points out all of the methods that are being added to the String class for data conversions: to_f (meh), to_yaml (maybe...), to_date (yikes!), to_blob (what the...!), etc. This means that to be able to use a string, you'll need all of these other classes.

The solution is to turn these "to" methods into "from" methods. Put from_string into Date, YAML, Float, and Blob classes. Haskell, too, uses "from" functions. My understanding is that this is more about the type system, but then I wonder if the type system is telling us something!
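As a sketch of the "from" style in Ruby: the Celsius class and its from_string method are made up for this example, and the accepted format (a number with an optional trailing C) is an assumption. The point is that String never has to know Celsius exists.

```ruby
# A small value class that knows how to build itself from a string.
class Celsius
  attr_reader :degrees

  def initialize(degrees)
    @degrees = degrees
  end

  # Accepts text such as "21.5", "21.5C", or " 21.5 c ".
  # Float() raises ArgumentError for anything that is not a number.
  def self.from_string(text)
    new(Float(text.strip.sub(/\s*c\z/i, "")))
  end
end

reading = Celsius.from_string("21.5 C")
puts reading.degrees
```

The dependency now points from Celsius to String, which is the direction you would expect: the specialized class depends on the general one.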

Symmetry. Philippe ends with an example of a method invoking three other methods. Here's an example of my own:

def foo()
  dog()
  @zoo.bat()
  cow()
end

It comes down to the readability of the code. Instead of figuring out what the code is doing, you're asking yourself, why the one instance variable? I'm reminded of poetry and music where the artist is supposed to follow a particular form. The second line of this method is like a limerick of six lines or a haiku with 20 syllables. "Why did you do that?" While this might be dramatic in a poem, this kind of drama is not wanted in code!
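One possible way to restore the symmetry is to hide the instance variable behind a method of the same shape as its neighbors. This is my own sketch rather than the example from the talk; the Keeper and Zoo classes and the calls log are invented so the code can run.

```ruby
class Zoo
  def initialize(calls)
    @calls = calls
  end

  def bat
    @calls << :bat
  end
end

class Keeper
  attr_reader :calls

  def initialize
    @calls = []
    @zoo = Zoo.new(@calls)
  end

  # All three lines now read the same way.
  def foo
    dog
    bat
    cow
  end

  private

  def dog
    @calls << :dog
  end

  # Delegates to the zoo so that foo does not mention @zoo at all.
  def bat
    @zoo.bat
  end

  def cow
    @calls << :cow
  end
end

keeper = Keeper.new
keeper.foo
puts keeper.calls.inspect
```

Whether the extra method is worth it depends on the code, but it does move the "why the instance variable?" question out of foo.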