Wednesday, 29 July 2009

How Are We Doing?

I had previously posted about our first day's figures. Since the rush caused by the MacInTouch launch, and then a spike when we were on the front page of Apple's download site, things have settled down. I've been busy fixing a couple of key bugs, and enhancing Rococoa in time for Snow Leopard, so there has been very little publicity, and I can now report on our steady-state traffic. These figures are not precise; sometimes they are just gleaned from looking at the Analytics graphs and guesstimating, but they are figures, based largely on the 7 days beginning 22 July.

Google Analytics shows that our steady state is 50 visitors a day, with 7 downloads of the application. Looking at the server logs though, Analytics misses many downloads. 10 a day are referred from Apple downloads, 5 a day from MacUpdate, and 17 a day from our own downloads page. I don't know why Analytics misses those last; a very few are people retrying downloads, but most just seem to slip the JavaScript net. In total I think that we get 33 downloads a day.

Analytics also reveals that 40% of our traffic is direct, 10% comes from people searching for the term 'velocraptor', and our bounce rate is 44%. From this I'm forced to conclude that 44% of our traffic is people either typing 'velocraptor' into the location bar and finding us rather than dinosaurs, or a similar effect with Google's 'I'm Feeling Lucky'. If we discount all the bounces as people who should just learn to spell, then our real steady state is 33 visitors a day - this matches the sum of the pukka search terms and referrals, but does not include those people who download the app without touching the HTML.

Spookily then, our ratio of visits to downloads is 1, although these aren't all the same people! This blows the industry average of 28% to pieces, but it isn't all good news - my next post will cover registrations.
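
As a rough check on the arithmetic above, here is a minimal Python sketch that uses only the figures quoted in this post. The variable names are made up for illustration, and because the numbers are guesstimates the results only agree approximately: the three named referrers sum to 32 against the estimate of 33, and taking a 44% bounce rate off 50 visitors gives about 28 rather than 33.

```python
# Downloads per day by referrer, as read from the server logs in the post.
downloads_by_referrer = {
    "Apple downloads": 10,
    "MacUpdate": 5,
    "own downloads page": 17,
}
logged_downloads = sum(downloads_by_referrer.values())

# Overall estimates stated in the post (hypothetical variable names).
estimated_downloads_per_day = 33
estimated_real_visitors_per_day = 33

# Visitors left after discounting the bounces reported by Analytics.
analytics_visitors_per_day = 50
bounce_rate = 0.44
non_bounce_visitors = round(analytics_visitors_per_day * (1 - bounce_rate))

# Downloads per visit, using the post's own estimates.
downloads_per_visit = estimated_downloads_per_day / estimated_real_visitors_per_day

print(f"Downloads from named referrers: {logged_downloads}")
print(f"Visitors after bounces: {non_bounce_visitors}")
print(f"Downloads per visit: {downloads_per_visit:.0%}")
```

The last figure is the one being compared with the 28% industry average.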

Tuesday, 21 July 2009

Is VelOCRaptor Good Enough?

From UserVoice - "I agree with Dyno wholeheartedly. Not to sound discouraging but it's really of no use in its current state. Even using the crispest font, it doesn't recognise half the number characters. And the PDF output is blurry. It's actually rather cheeky getting users to en masse as beta testers in this way. (More dubious practices to follow, no doubt)."

I don't mean to be defensive, but to say that VelOCRaptor is no use in its current state, and to accuse us of dubious practice by releasing it is a bit harsh. If you don't like the product, by all means don't use it, but please don't question our motives.

I would obviously like for the accuracy to be better, but 1.05 developers are not going to develop a world-beating OCR engine. The companies that have developed OCR engines are charging you $125 (FineReader) to $499 (OmniPage) for them, and their integration and usability are quite frankly substandard. I've tried to license a world-class engine, but the company won't risk letting the technology ship in a product with the features and price that define VelOCRaptor.

So the OCRopus engine is the best that you can buy for under $100. I wish it was better, but it isn't, yet. I thought long and hard about whether to ship with the current engine and I came to the conclusion that it was better than nothing - which is after all the alternative at this price. We state up front that the accuracy isn't great, and we post an example showing its performance. We've released it as it is because for many people it is good enough to produce searchable PDFs and grab occasional text - would the world be better off if we hadn't?

Some people are delighted with VelOCRaptor, others disappointed, but we're not forcing anyone to buy it, and we can hardly be accused of misrepresenting the performance. Releasing a product is hard work and costly - I've had no income for 6 months now. If we don't charge money then we can't tell if there is a market - and we need to know that there is a market if we are to continue development, adding features that users are asking for, and pulling in new OCRopus releases so that it delights more people.

So whilst I apologise for the lack of performance, I am unapologetic about releasing VelOCRaptor. By releasing early and often we get the chance to see if this proto-bird can fly, and users have something that may be of some use. It's an open secret that, whilst I'd love you to license VelOCRaptor, the current release will continue to function forever without a licence. The reason for that is that we want you to carry on using it until the engine works well enough. In the meantime, to quote Guy Kawasaki, we have embraced "Don't worry, be crappy".

Tuesday, 14 July 2009


Spurred by an email enquiry, I've added a page describing the various ways of integrating VelOCRaptor with other programs.

Friday, 10 July 2009


Yesterday ABBYY announced the release of ABBYY FineReader Express Edition for Mac. At the risk of promoting our most credible competitor, I've just bought a copy, and its accuracy is very good. They've also obviously worked hard to make it simple to use compared to its previous incarnation. You can't try before you buy, but if you need accuracy and will pay 89 Euros (they won't show me the price in $), it's the one I'd go for at the moment.

Thursday, 9 July 2009

Memory Leak Fixed

After I spent way too long trying to find a solution, VelOCRaptor now deals with multi-hundred-page PDF documents, if you have the patience!

The issue turned out to be 2 leaks, one in the code that extracts images from PDF files to feed to the engine, and one in the code that writes PDFs from those images once the reading is done. Both were easy to diagnose, but hard to fix, as they were symptoms of bugs in Mac OS rather than my code. But nobody cares - you just want software that works - and now it does work a lot better.

[Edit] You can download the update (it is build 195) by using 'Check For Updates' on the VelOCRaptor menu.

Wednesday, 8 July 2009

It's Quiet, Too Quiet

So what's been happening? Well, a steady stream of reports of the same bug - we run out of memory if you try to read from PDF documents with hundreds of pages - and some great feedback and suggestions via UserVoice.

I was surprised when people first used VelOCRaptor on large PDF documents, but then I'd had my mind in the world of little scanners, and reckoned without the Internet. So people have been trying to push whole PDF books through the thing, and it breaks. It took very little work to find the source of the memory leak, but fixing it is another issue. Basically the RubyCocoa system we use for the guts of the PDF reading and writing isn't up to the job, and I'm having to re-write much of that code in Objective-C - the language of Mac OS X. It's irritating, but just a fact of programmer life, so I'm biting the bullet and getting on with it. Wish me luck.