
Saxon government's press releases now powered by JRuby on Rails

Posted on April 07, 2008

Last week, the Medienservice, the platform through which the Saxon government publishes its press releases to journalists and the public, was relaunched. It now runs on a cluster of JBoss servers that are part of the official Saxon e-government platform. While the public web frontend might look like just another blog-like application to you, I assure you that what happens in the background is anything but simple - there’s a lot going on, like deferred publishing, publishing press releases only to subscribed journalists, and sending out press releases in four different formats including PDF and XML, to name only a few.

As far as I know this is the first public German JRuby on Rails application - one more reason for me to be proud of being part of the team at webit! that built this baby.

RDig 0.3.5

Posted on February 26, 2008

RDig is a tiny web and file system crawler built on top of the Ferret search engine. It’s one of my less active side projects and from what I can tell doesn’t have a very large user base. However there are some people out there who actually use it, and some of those people even tell me so and suggest new features from time to time :-)

Limit crawling depth

You can now configure a maximum crawling depth to restrict RDig to only index pages up to this level. For example, setting config.crawler.max_depth = 1 will make RDig only index the configured start pages, and pages the start pages directly link to. You get the picture I guess.

This option is especially useful if restricting RDig to a pre-defined number of hosts is not an option for your use case, but you still don’t intend to have it crawl the whole web.
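To make the depth semantics concrete, here is a minimal sketch of a depth-limited breadth-first crawl. It is not RDig’s actual code: the LINKS table and the crawl method are hypothetical stand-ins for real pages and link extraction, and depth 0 is assumed to mean the start pages themselves.

```ruby
require 'set'

# Hypothetical in-memory "web": each page maps to the pages it links to.
LINKS = {
  'start' => ['a', 'b'],
  'a'     => ['c'],
  'b'     => ['start'],
  'c'     => ['d'],
  'd'     => []
}

# Breadth-first crawl that does not follow links from pages at max_depth.
# Depth 0 is a start page; nil means no limit.
def crawl(start_pages, max_depth = nil)
  seen  = Set.new(start_pages)
  queue = start_pages.map { |page| [page, 0] }
  until queue.empty?
    page, depth = queue.shift
    next if max_depth && depth >= max_depth
    LINKS.fetch(page, []).each do |target|
      next if seen.include?(target)
      seen << target
      queue << [target, depth + 1]
    end
  end
  seen.to_a
end

puts crawl(['start'], 1).inspect
```

With a maximum depth of 1 the sketch visits the start page plus the pages it links to directly, and nothing beyond that.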

HTTP proxy auth support

If you are behind a proxy and have to use HTTP Basic Authentication to get through it, you can specify the proxy URL, user name and password:

cfg.crawler.http_proxy = "http://yourproxy:8080"
cfg.crawler.http_proxy_user = "username"
cfg.crawler.http_proxy_pass = "secret"
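For the curious, this is roughly what such settings translate to at the Net::HTTP level. It is only a sketch using Ruby’s standard library, not a copy of RDig’s internals; the target host www.example.com is a placeholder.

```ruby
require 'net/http'
require 'uri'

# Same values as in the configuration above.
proxy = URI.parse('http://yourproxy:8080')

# Net::HTTP.new takes the proxy address, port, user and password after
# the target host and port. No connection is opened at this point.
http = Net::HTTP.new('www.example.com', 80,
                     proxy.host, proxy.port, 'username', 'secret')

puts http.proxy_address
puts http.proxy_port
```

A request made through this object (for example http.get('/')) would then be sent via the proxy, with the credentials used for proxy authentication.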

Under the hood

I put some work into refactoring parts of RDig in order to make integration with acts_as_ferret easier. I’ll write more about that in another post.

Get it!

RDig is available as a gem via Rubyforge.

Regexps on steroids with Ruby 1.8.x

Posted on January 27, 2008

Ruby 1.9 comes with a powerful new regular expression engine called Oniguruma. It sports better handling of UTF-8 encoded content, plus goodies like positive and negative look-behind or named matches. Here’s a good overview of these and some more of the new features of Oniguruma.

There are two ways to get Oniguruma into a pre-1.9 Ruby: You can patch the Ruby source tree with Oniguruma and build your own Ruby, or use the Oniguruma gem, which makes it fairly easy to use the new style regular expressions in any Ruby 1.8.x project. Here’s how:

$ wget http://www.geocities.jp/kosako3/oniguruma/archive/onig-4.7.1.tar.gz
$ tar xzf onig-4.7.1.tar.gz
$ cd onig-4.7.1
$ ./configure --prefix=/usr
$ make
$ sudo make install
$ sudo gem install oniguruma

Note the prefix argument in the call to configure - it should point to the location of your current ruby installation. So if your ruby executable is located in /usr/bin, you’ll have to use /usr here as shown above.
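If you are unsure which prefix your Ruby was installed with, you can ask Ruby itself. A small sketch using the standard rbconfig library:

```ruby
require 'rbconfig'

prefix = RbConfig::CONFIG['prefix']   # installation prefix, e.g. /usr
bindir = RbConfig::CONFIG['bindir']   # directory of the ruby executable

puts "configure with --prefix=#{prefix}"
puts "ruby executable directory: #{bindir}"
```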

If everything went well so far, try it out in irb:

require 'rubygems'
require 'oniguruma'
reg = Oniguruma::ORegexp.new '(?<before>.*)(a)(?<after>.*)'
match = reg.match( 'terraforming' )
puts match[0]         # => 'terraforming'
puts match[:before]   # => 'terr'
puts match[:after]    # => 'forming'

The downside of not having Oniguruma patched into a self-compiled version of Ruby is that something like

'terraforming' =~ /(?<before>.*)(a)(?<after>.*)/

won’t work because it will be handled by your Ruby version’s built-in regexp engine.
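For comparison, on Ruby 1.9 or later, where the new engine is built in, plain regexp literals understand the same syntax without any gem. A short sketch (the 'rev 42' string is just an illustrative input):

```ruby
# Named groups with a regular regexp literal.
match = /(?<before>.*)a(?<after>.*)/.match('terraforming')
puts match[:before]   # => terr
puts match[:after]    # => forming

# Positive look-behind: match digits only when preceded by "rev ".
revision = 'rev 42'[/(?<=rev )\d+/]
puts revision         # => 42
```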

Encrypted root and swap plus suspend to disk with Gutsy

Posted on January 24, 2008

In order to give my trusted T42 a slight speed-up I decided to replace the built-in 5400 rpm hdd with something faster. I decided to go with Seagate’s ST910021A, which seems to be a great choice from what I can tell so far - it’s noticeably faster, and despite its 7200 rpm it’s nearly as quiet as the original 80GB disk from Hitachi.

But I digress. Initially I just wanted to copy over all the stuff and be done with it, but then I took the chance to do a fresh install of Ubuntu so I could try out the hard disk encryption setup that has been introduced in the alternate installer of 7.10. Until then I only had an encrypted /home, which was pretty useless since most of the time my notebook isn’t shut down but hibernated, and I never needed to type in my passphrase upon resume…

Grails project begging for attention

Posted on January 14, 2008

Sorry, but I can’t think of any other reason why Graeme Rocher might write such crap.

Among the points he makes when trying to convince his readers why they should choose Grails over Rails, there are at most two or three which are somewhat reasonable, namely those dealing with integrating your application with external J2EE based services. I completely agree that these are valid points when comparing Grails running inside a fully-fledged J2EE container to Rails running in, say, Mongrel. But since Rails runs fine in a J2EE environment as well, that’s an unfair and misleading comparison.

Using container-managed database connections via JNDI in a JRuby on Rails web app is no problem at all, neither is using Quartz to schedule Rails background jobs, just to name a few examples. I don’t see anything stopping people from using the whole range of J2EE container features in their JRuby/Rails applications once they need to do so.

The whole ‘Grails is more enterprisey than Rails’ argument falls apart once you stop comparing apples to oranges, and slap JRuby + Rails onto that damn app server.

Grails 1.0 coming out within the month

Uh cool, yeah. Until now I thought statements like this were more a specialty of closed source vendors trying to convince their potential clients not to check out the competition. Looks like they can’t wait to finally attach that decision maker friendly 1.0 label to Grails ;-)

Anyway, I feel we’re all going to have an interesting time in the future watching how the competition between JRuby/Rails and Groovy/Grails goes on. After all, competition tends to lead to better products in the end.

Job scheduling with JRuby and Rails

Posted on January 12, 2008

As promised earlier, here’s the first of several articles I’m planning to write about running Rails on JRuby. Originally I wanted to start this little series with some kind of ‘getting started with JRuby on Rails’ guide. Since I haven’t found the time (or, say, motivation) to write one for weeks now, I decided to skip right ahead to some more advanced topics. So for this post, I’m assuming you already got your hello world JRuby on Rails project up and running and deployed to the application server of your choice. If this is not the case, have a look at the Wiki for documentation about getting started. It’s also worth following the various relevant blogs, as well as the jruby and jruby-extras mailing lists.

Ok, this post is titled Job scheduling with JRuby and Rails for a reason, so let’s get started with this now.

While Rails itself nowadays runs quite flawlessly inside application servers like JBoss or Glassfish, the usual way to handle background jobs (push them to some external daemon) doesn’t fit a J2EE application server environment particularly well. Besides the fact that I couldn’t think of an easy way to get BackgrounDRb running from my application’s WAR file, it simply doesn’t feel right to have any extra daemons like BackgrounDRb running alongside that fat application server.

As it turns out, there are several ways to do better.

The GoldSpike solution

The kind folks from the jruby-extras team provide two servlets, RailsTaskServlet, and RailsPeriodicalTaskServlet as part of the GoldSpike plugin. These servlets can be used to run arbitrary Ruby code inside the context of your Rails application either once or periodically every n seconds. To schedule a job running YourJobClass.do_stuff every minute, you would place the following declaration into your web.xml:

<servlet>
  <servlet-name>periodicalTask</servlet-name>
  <servlet-class>org.jruby.webapp.RailsPeriodicalTaskServlet</servlet-class>
  <!-- one init-param element per name/value pair -->
  <init-param>
    <param-name>interval</param-name>
    <param-value>60</param-value>
  </init-param>
  <init-param>
    <param-name>script</param-name>
    <param-value>YourJobClass.do_stuff</param-value>
  </init-param>
  <load-on-startup>1</load-on-startup>
</servlet>

Oh the joy of XML configuration files ;-)
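Conceptually, a periodical task like this boils down to a loop that runs your code, sleeps for the interval, and repeats. The following is a plain Ruby illustration of that idea, not GoldSpike’s implementation; the PeriodicTask class is made up for this example and the short interval is only there to keep the demo quick.

```ruby
# Illustration only: runs a block every `interval` seconds in a
# background thread until stopped.
class PeriodicTask
  def initialize(interval, &job)
    @interval = interval
    @job      = job
    @running  = false
  end

  def start
    @running = true
    @thread = Thread.new do
      while @running
        begin
          @job.call
        rescue StandardError => e
          # A failing job should not kill the loop.
          warn "periodic job failed: #{e.message}"
        end
        sleep @interval
      end
    end
    self
  end

  def stop
    @running = false
    @thread.join if @thread
  end
end

runs = 0
task = PeriodicTask.new(0.05) { runs += 1 }.start
sleep 0.3
task.stop
puts "job ran #{runs} times"
```

In the servlet setup, the block would correspond to the script parameter (YourJobClass.do_stuff) and the interval to the configured number of seconds.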

While this works great, I had to run a job not every few seconds, but once a week on a defined day and time. Besides that, declaring a separate servlet for each single background job seems like overkill. From my Java days I knew that the Quartz library provided exactly what I needed - support for cron patterns. So the challenge was to get Quartz to run my Ruby script inside the context of my application.

The rails_quartz plugin

Our target platform was JBoss, which already includes the Quartz library. I’m not sure about other app servers; if yours doesn’t have it already, just download Quartz and put the jar, including any dependencies, somewhere in your application so it ends up inside the WEB-INF/lib folder of your WAR file. With Warbler, which is my preferred way to package JRuby on Rails applications, RAILS_ROOT/lib is a good place, since it will pick up any jars from there automatically.

How does it work?

The plugin provides a ContextListener, which, when declared in web.xml, looks for any job declarations and tells the Quartz Scheduler about them. Here’s an example web.xml snippet scheduling a job to run every Friday at 10 am:

<context-param>
  <param-name>yourJobCommand</param-name>
  <param-value>YourJobClass.do_stuff</param-value>
</context-param>
<context-param>
  <param-name>yourJobCronPattern</param-name>
  <param-value>0 0 10 ? * 6</param-value>
</context-param>

<listener>
  <listener-class>org.jruby.webapp.quartz.QuartzContextListener</listener-class>
</listener>

As you see, I use context parameters to configure the command to run, and the cron pattern to use. You can declare any number of jobs you want, just stick to the naming scheme for the parameters: <jobName>Command and <jobName>CronPattern so the listener can find out which pattern belongs to which job.
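To illustrate the naming scheme, here is a sketch of how parameters could be paired up into jobs, and how the six fields of a Quartz cron pattern read. The helper names jobs_from_params and cron_fields are hypothetical and not part of the plugin (the real listener is Java code); the parameter values are the ones from the snippet above.

```ruby
# Hypothetical helper: pair up <jobName>Command and <jobName>CronPattern
# entries from a hash of context parameters.
def jobs_from_params(params)
  jobs = {}
  params.each do |name, value|
    if name =~ /\A(.+)Command\z/
      (jobs[$1] ||= {})[:command] = value
    elsif name =~ /\A(.+)CronPattern\z/
      (jobs[$1] ||= {})[:cron] = value
    end
  end
  # Only keep jobs that have both a command and a pattern.
  jobs.select { |_, job| job[:command] && job[:cron] }
end

# Quartz cron patterns list these fields in order (a 7th, year, is optional).
CRON_FIELDS = [:seconds, :minutes, :hours, :day_of_month, :month, :day_of_week]

def cron_fields(pattern)
  Hash[CRON_FIELDS.zip(pattern.split)]
end

params = {
  'yourJobCommand'     => 'YourJobClass.do_stuff',
  'yourJobCronPattern' => '0 0 10 ? * 6',
  'unrelatedParam'     => 'ignored'
}

jobs = jobs_from_params(params)
jobs.each do |name, job|
  puts "#{name}: #{job[:command]} at #{cron_fields(job[:cron]).inspect}"
end
```

Note that Quartz counts days of the week from Sunday = 1, so 6 is Friday, and the ? in the day-of-month field means ‘no specific value’. This differs from classic Unix cron, where Friday is 5.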

You can get the plugin here: https://projects.jkraemer.net/svn/plugins/jruby/quartz_rails/ . As always, any feedback is welcome.

Strange Mongrel 1.1 error (solved)

Posted on January 07, 2008

Right after updating Mongrel, gem_plugin and some other (probably unrelated) gems today, Mongrel didn’t like me any more, refusing to start up any Rails application. Instead, I got this nice message complaining about a missing init.rb in the activerdf gem:

** Rails loaded.
** Loading any Rails specific GemPlugins
Exiting
/usr/local/lib/site_ruby/1.8/rubygems/custom_require.rb:27:in 'gem_original_require': no such file to load -- /usr/lib/ruby/gems/1.8/gems/activerdf-1.6.1/lib/activerdf/init.rb (MissingSourceFile)

It took me a while to find the fix on Rubyforge via Google, so maybe this helps somebody else having the same problem.

acts_as_ferret 0.4.3

Posted on November 18, 2007

Long time since the last release (not counting the short-lived 0.4.2 …), and I guess most people already use trunk anyway, but for the faint of heart, here’s the new stable version of your favourite Rails fulltext search plugin.

As always, get it via svn from svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret. More installation information can be found on the acts_as_ferret Trac site.

No big news feature-wise, I already wrote about the more important features when I added them to trunk:

Going through the timeline looking for some cool feature I didn’t already write about I found several smaller things worth mentioning:

Dynamic document specific boosts

This comes in handy if you want to have search results automatically ranked by a criterion which is different for each record, e.g. the popularity of an article in your shop:


class Article < ActiveRecord::Base
  acts_as_ferret :boost => :popularity
  def popularity
    # return dynamic boost value for this document
  end
end

You may also apply the dynamic boost to a specific field (or even different boosts to different fields), so it only is applied when a hit occurs in the boosted field. This way you can choose at query time if you want to have the boosting applied or not. Just query either the boosted fields, or the normal ones:


class Article < ActiveRecord::Base
  acts_as_ferret :fields => {
    :title         => {},
    :boosted_title => { :boost => :rating }
  }
  def rating
    # return rating of this article
  end

  # value for the boosted title field
  def boosted_title
    title
  end
end

New and better start/stop scripts

The DRb server now has a unified start/stop script, and it ships with scripts for using it as a Windows system service. Thanks to Peter Jones and Herryanto Siatono for contributing these.

Also, the acts_as_ferret gem now has an installer that will install the server script and sample config into your Rails project:


$ gem install acts_as_ferret
$ rails test
$ cd test/
$ aaf_install
$ script/ferret_server -e production start

And your DRb server is up and running. Easy, isn’t it?

No more :remote => true

Last but not least, aaf now is a bit more clever and goes into remote mode automatically if the DRb server is configured for the current environment. If for whatever reason you don’t want that, use :remote => false.

How I learned to stop worrying and love JRuby on Rails

Posted on October 31, 2007

As I wrote earlier, I got pretty excited about JRuby and especially about the idea of running Rails on top of it at RailsConf Europe. By pure coincidence we had just started a project at webit! at that time which had to be deployed to the customer’s J2EE infrastructure. I hadn’t really followed JRuby development before, and therefore greatly underestimated its level of maturity. So, looking for a Rails-like way to build this application, I decided to go with the Groovy-based Grails framework.

At first Grails looks and feels much like Rails; however, if you look closer and actually try to build a real application with it, the differences start showing up. Don’t get me wrong, when compared to the more traditional J2EE ways to build web applications (my experiences in this field range from plain Servlets to Struts and Spring), Grails is a huge step in the right direction. Unfortunately for me, coming from Rails, it just didn’t feel right or complete in many places. Partly this is for sure because Grails, despite its name, is not just a Rails clone, but does things in its own way in many places. Without going into much more detail here, things often weren’t working the way I expected, and documentation was often outdated or missing.

Another issue with Grails for me was Hibernate, its persistence layer of choice. I simply can’t get used to the way Hibernate works. In my opinion it abstracts way too much from the database with its own query language and all its object-oriented query building glory. Also, it seemed to queue up SQL statements and execute them at will at some later time, which I found irritating to say the least. I really don’t understand why Hibernate is the persistence framework of choice in so many J2EE projects. On the other hand it fits the ugly picture of the bloated J2EE web application quite well ;-)

To summarize this rant, don’t underestimate the learning curve of Grails, which will be even steeper when you aren’t already used to Hibernate. I think Rails people having to do J2EE development are just not the target audience of Grails. But it might be a good fit for J2EE developers who either already have their Hibernate models in place or at least have the will to invest some serious time in learning Hibernate.

After all, as you might already have guessed, I decided to start from scratch with JRuby and Rails after RailsConf Europe. And it really felt like coming home from a long and exhausting trip. Despite the lost time in the beginning we met our deadline and had a really great time solving problems the Rails way and deploying to JBoss every now and then. Right now I’m in the middle of my second JRuby on Rails project, same customer, same target platform. I plan to write some more articles about our experiences with JRuby, so stay tuned.

Keep an eye on your DRb server with Monit

Posted on October 21, 2007

Many people nowadays seem to use monit to ensure their Rails application is always up and running, and maybe even to get notified in case of any problems like unusually high load or memory usage.

Since acts_as_ferret doesn’t really like it when the DRb server has gone away, it’s a good idea to monitor not only your Mongrels, but also the DRb server itself. So here’s a small snippet of monit configuration derived from one I’m using elsewhere:

# monit configuration snippet to watch the Ferret DRb server shipped with
# acts_as_ferret
check process ferret with pidfile /path/to/ferret.pid

    # username is the user the drb server should be running as (It's good practice
    # to run such services as a non-privileged user)
    start program = "/bin/su -c 'cd /path/to/your/app/current/ && RAILS_ENV=production script/ferret_start' username"
    stop program = "/bin/su -c 'cd /path/to/your/app/current/ && RAILS_ENV=production script/ferret_stop' username"

    # cpu usage boundaries
    if cpu > 60% for 2 cycles then alert
    if cpu > 90% for 5 cycles then restart

    # memory usage varies with index size and usage scenarios, so check how
    # much memory your DRb server uses up usually and add some spare to that
    # before enabling this rule:
    # if totalmem > 50.0 MB for 5 cycles then restart

    # adjust port numbers according to your setup:
    if failed port 9010 then alert
    if failed port 9010 for 2 cycles then restart
    group ferret

As you can see it’s pretty straightforward, well, maybe except the start/stop commands which took me a few iterations to get right. I also added this to the acts_as_ferret distribution: monit-example.