The good nerd underground
Lots of data but no answers
I was thinking we need one during our meeting at work today.

I work in the web development group at my company. It's a cool group. Our product management team are good people too, and clueful about technical things. We meet with them every Thursday.

It came up for the umpteenth time today that some of our clients are screen scraping one of our web products to get data that they could much more easily and reliably get in a useful form from one of our other products, but that would mean that they'd have to buy the other product. This comes up because of the complaints that we get from said clients when we make certain changes -- changes that a normal web user would either not notice or consider to be a vast improvement, but that would break a fairly dumb screen scraper. Of course we're not the ones in charge of deciding whether or not to yell at them for screen scraping.
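To make the fragility concrete, here is a minimal sketch of how a "fairly dumb" scraper can break on a change no human visitor would notice. Everything in it is made up for illustration: the page markup, the bond name, and the function `dumb_scrape_price` are hypothetical and are not our actual product or any client's actual code.

```python
import re

# Hypothetical markup before a cosmetic change: bare table cells.
OLD_PAGE = "<table><tr><td>ACME 5% 2010</td><td>101.25</td></tr></table>"

# Hypothetical markup after the change: same data, same look in a browser,
# but the cells now carry class attributes and an extra <span>.
NEW_PAGE = (
    '<table><tr><td class="name">ACME 5% 2010</td>'
    '<td class="price"><span>101.25</span></td></tr></table>'
)


def dumb_scrape_price(html):
    """Pull the price out by matching the exact markup around it.

    Returns the price as a float, or None if the pattern is not found.
    """
    match = re.search(r"<td>[^<]*</td><td>([0-9.]+)</td>", html)
    return float(match.group(1)) if match else None


print(dumb_scrape_price(OLD_PAGE))  # 101.25
print(dumb_scrape_price(NEW_PAGE))  # None
```

The data never changed, only the tags wrapped around it, which is exactly the sort of change that generates a complaint about "unspecified problems".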

Now it happens that when it came up today one of the companies mentioned as a possible culprit happened to be the same company that twe and harrock work for. It occurred to me that they'd probably also be irritated if they knew this was going on because not only is it a bad practice, but they're smart enough to know that in the long run it would be better and probably even cheaper (when you factor in lost time due to us changing the layout of the pages) if these people would just go ahead and buy the darn service that actually does what they want. It would be great if I could just say 'tell your people to cut it out'. Sadly they probably don't deal with the people who use fixed income security data, and even if they did, stupid stuff like this happens because the people in charge often won't authorize the purchase of the products and services that people actually need. Sigh.

It was a nice fantasy, for five minutes, thinking that the cool group in my company could band together with the cool group in theirs and fix this. Then reality set in and I realized just how few people listen to the cool group.

twe Date: September 22nd, 2006 02:22 pm (UTC)
Screen scraping is not one of the things I have as yet run into here, which is not to say it does or doesn't exist. We probably haven't dealt with the folks you're talking about; then again, it's possible we have and haven't realized it; many of the people I deal with I've never met in person. Not that I have any authority to tell random people to "cut it out." :)
greyautumnrain Date: September 22nd, 2006 02:43 pm (UTC)
Yeah, I'm sure you don't, and my impression was that you dealt with much more complex and hardware-related issues, whereas this is just data software. Theoretically the people who should be doing something are our marketing and legal people, but they don't seem to be doing it. The people who are screen scraping probably haven't told any other group that they're doing it either, even the people they ask to complain to us every time we change the way a drop-down menu works because it causes unspecified problems on their end. More than anything else, I was expressing exasperation with the sort of stupidity that leads to this sort of thing and then allows it to continue.
twe Date: September 22nd, 2006 06:45 pm (UTC)
my impression was that you dealt with much more complex and hardware related issues, whereas this is just data software

I'd say the issues we deal with are mostly software, and quite often vague or badly expressed when they are initially presented to us. Or perhaps that's just the end of a hard week talking.
treptoplax Date: September 24th, 2006 08:43 pm (UTC)


Oh, we have all kinds of screenscrapers, and you deal with them all the time. Mostly it's 3270-emulator screen-scrapers, though, putting a pretty GUI on character-mode mainframe apps (i.e., all that stuff where, "oh, look, display as EBCDIC and suddenly the bytestream makes sense").

We even have at least one UNIX one (fun fact: telnet disables Nagle's algorithm to cut apparent latency, and so is not really well-suited for long-distance bulk data transport).
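For anyone who hasn't met it, Nagle's algorithm is switched off per socket with the standard `TCP_NODELAY` option. The snippet below is only a sketch of what that looks like from Python; the helper name `make_low_latency_socket` is made up, and it says nothing about how any particular telnet client is actually implemented.

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY).

    Small writes are then sent right away instead of being coalesced,
    which suits interactive keystrokes and hurts bulk transfers.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_low_latency_socket()
# A nonzero value means the option is set; the exact number varies by OS.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```

With the option set, every small write tends to go out as its own packet, which is the trade-off described above: good for typing, wasteful for shipping lots of data a long way.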

I dunno about web-scrapers, but it wouldn't surprise me.

NB: Software and staffing are probably out of different budget buckets, and people would rather spend from the latter than the former...
harrock Date: September 25th, 2006 05:03 pm (UTC)


I've been darn close to writing code to scrape the sites of internal groups, because it's easier than getting them to give me access to their databases. I wouldn't have the temerity to complain if they changed their format. He who lives by the sword dies by the sword. :)

Sounds like someone has taken the cost-control mantras a bit too much to heart. The inevitable result of that kind of thing is that your temporary kludges become permanent, and you end up dumping enormous numbers of man-hours into maintaining said kludges and complaining bitterly whenever someone interferes with them--making it harder and harder to move anything forward.

Think of yourselves as the forest fire that wipes out the canopy so that new growth can begin. Of course the little forest animals will wail and gnash their teeth, but the ones who can adapt will do fine.
