I’m starting to think I’m a masochist. Examine the evidence:
- I’ve been in 2 serious relationships with crazy (and I do mean crazy) girls — both ended crappily.
- I still hand wind my guitar strings.
- I’ve worked with PHP since version 3 and I’ve had a job doing it for 5 years. I actually gave up a job doing embedded systems with C to keep doing PHP work.
- I keep thinking I want to learn Erlang.
- Lastly, and I think this is the big one, despite the fact that I’ve worked on no fewer than 3 failed IRC clients, I started working on an IRC bot project.
If you didn’t know, IRC was invented in 1793 by the world-renowned horrible person, the Marquis de Sade. He was so horrible, in fact, that the term sadism comes from his name. The work I’m referring to is RFC1459, arguably his most notorious and mind-boggling work. RFC1459 is part of the reason some people consider de Sade to be the devil incarnate.
RFC1459 was so bad that the IETF has refused to standardize it. An attempt was made in 2000 to update it with RFCs 2810, 2811, 2812, and 2813. After reading all of them, I can’t really tell what was updated. It’s more like someone took the original document of unholy sin and smeared it across 4 others. And even after this act of moral indecency, it’s still not standardized. It doesn’t help that the “updates” were never supposed to be considered a new standard, or that pretty much every IRCd and client does whatever the hell it wants.
Here are some things I really, really hate about IRC. I mean other than it being a complete total stab in the groin…
Ident
Does anyone actually have an ident daemon running? And does it actually help anything? It’s pretty much a useless addition, and why it’s mentioned in the IRC protocol (and by protocol, I mean random suggestions) I can only speculate.
Mode strings
This is probably the worst abomination of the entire spec (and by spec I mean loose collection of internet bowel movements). So there are user modes and channel modes. A mode character may or may not take a parameter, and a mode string can consist of a single mode character or a whole bunch of them. Because of that, you end up with this kind of crap:
MODE #liek :+om-v+b nick1 nick2 *.aol.com
Which sets nick1 as an operator, sets the channel to moderated, takes voice away from nick2, and bans aol users from the channel. Look at sections 3.1.5 and 3.2.3 in RFC2812 for some more awful ideas. If you think parsing that is not so bad, then you should know that not every IRC-network is going to have the same modes. And to figure out what modes a channel might have…
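To show why this hurts, here is a rough Python sketch of pairing mode characters with their parameters. It assumes the mode string and its arguments have already been split apart, and the `PARAM_MODES` table is a made-up stand-in: a real client would have to build that table from whatever the network it is connected to says it supports.

```python
# Assumption: the modes that always consume a parameter on this imaginary
# network. Real networks differ, so treat this table as illustrative only.
PARAM_MODES = {"o", "v", "b", "k"}


def parse_modes(mode_string, args):
    """Return a list of (sign, mode, parameter-or-None) tuples."""
    changes = []
    sign = "+"
    remaining = list(args)
    for char in mode_string:
        if char in "+-":
            # A sign applies to every mode character until the next sign.
            sign = char
            continue
        param = None
        if char in PARAM_MODES and remaining:
            # Parameters are consumed left to right, in mode order.
            param = remaining.pop(0)
        changes.append((sign, char, param))
    return changes


for change in parse_modes("+om-v+b", ["nick1", "nick2", "*.aol.com"]):
    print(change)
```

Even this toy version has to carry state (the current sign and the unconsumed parameters) across the whole string, and it is only correct as long as the parameter table matches the server.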
005
RFC2812 says that numeric reply 005 is, “Sent by the server to a user to suggest an alternative server.” It’s just not true at all. In all my years of working with IRC, I’ve never seen 005 used in this capacity.
005 is used for relaying the server configuration to the client. This is a separate, 5-year-old document that explains the ISUPPORT reply. Of course, it’s not official or anything. It’s not even really called ISUPPORT. The page I linked to at least tells you what IRCds are doing what. Keeping track of all that configuration info is unbelievably tedious.
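As a rough illustration of what a client ends up doing with 005, here is a small Python sketch that turns one such line into a dictionary. The server name, nickname, and token values in the sample line are invented, and the `KEY=VALUE` layout is the de facto convention rather than anything the RFCs promise, so take it as a sketch under those assumptions.

```python
def parse_isupport(line):
    """Parse one raw 005 line into a dict of configuration tokens."""
    # Drop the human-readable trailing text that follows " :".
    head = line.split(" :", 1)[0]
    fields = head.split(" ")
    # fields[0] is the server prefix, fields[1] is "005",
    # fields[2] is our own nickname; the rest are the tokens.
    settings = {}
    for token in fields[3:]:
        key, sep, value = token.partition("=")
        # Tokens without "=" are plain flags; store them as True.
        settings[key] = value if sep else True
    return settings


# Hypothetical sample line; the values are made up for illustration.
sample = (":irc.example.net 005 bot CHANMODES=b,k,l,imnpst "
          "PREFIX=(ov)@+ NICKLEN=30 SAFELIST :are supported by this server")
print(parse_isupport(sample))
```

A server can send several 005 lines in a row, so a real client would merge the dictionaries from each one.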
NAMES
Channels on IRC are usually small, but on the off-chance that you find yourself in a warez channel, you’ll be in a sea of people. When you join in the chat, the server will send you the list of other chatters, which can take quite a while depending on the legality of the channel you’re in. The problem:
The names list can be too large to fit into one packet, so the server will break it up. If you’re in a big room, there’s bound to be lots of activity, and the server will broadcast whatever messages it’s supposed to no matter what else it’s doing. So rather than queuing a JOIN notification while it’s processing the names reply, it’ll just send the notification right in between packets.
Yeah, me too. I also have no idea what order the names list might be in. I have no idea if it’s random, alphabetized, or predicted by Nostradamus — there’s nothing in the spec (and by spec, I mean not-spec) about it.
So how do you know what the NAMES reply is? Unlike JOIN or QUIT or KICK, when you request information from the server, chances are you’ll get a numeric reply.
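Here is a hedged Python sketch of collecting a names list while tolerating a JOIN that shows up in the middle of it. It assumes every line carries a prefix, that only `@` and `+` are used as nickname prefixes, and that the sample lines (server name, nicknames, hosts) look roughly like what a server sends; all of those are assumptions for illustration.

```python
def collect_names(lines, channel):
    """Gather nicknames for one channel until the end-of-names numeric (366)."""
    names = set()
    for line in lines:
        prefix, command, rest = line.split(" ", 2)
        if command == "353":
            # rest looks like: "<me> = #chan :nick1 @nick2 +nick3"
            head, _, trailing = rest.partition(" :")
            if head.split(" ")[-1] == channel:
                for name in trailing.split():
                    # Strip the operator/voice markers (assumed to be @ and +).
                    names.add(name.lstrip("@+"))
        elif command == "JOIN":
            # Someone joined while the list was still being sent.
            if rest.lstrip(":") == channel:
                names.add(prefix.lstrip(":").split("!", 1)[0])
        elif command == "366":
            break
    return names


# Hypothetical traffic with a JOIN wedged between two 353 replies.
traffic = [
    ":irc.example.net 353 bot = #liek :bot @nick1 +nick2",
    ":latecomer!user@host.example JOIN #liek",
    ":irc.example.net 353 bot = #liek :nick3",
    ":irc.example.net 366 bot #liek :End of /NAMES list.",
]
print(sorted(collect_names(traffic, "#liek")))
```

Using a set sidesteps the ordering question entirely, and it also means a late joiner who appears in both the JOIN and a later 353 reply is only counted once.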
Numeric Replies
By numeric, they mean a string of digits, not an actual binary number. There are a few things I don’t understand about numeric replies.
Why do some things have a numeric reply instead of just the text? Take NAMES for instance or WHOIS. I thought it had something to do with whether the client sent a message requesting the reply. But then why is 005 considered a numeric reply? Or 001? Both are sent by the server right after a connection.
Where the hell did these numbers come from? There’s some basic organization — 0xx are informative, 4xx are errors, 3xx are requested info… sometimes… 2xx are used for… uh…
Beyond that, I can’t figure out why the NAMES reply is 353 and the “End of NAMES” message is 366. MOTD Starting is 375, the actual MOTD packet is 372 and MOTD Ending is 376.
Additionally, when mentioning what reply is generated from what message, the spec (and by spec, I mean gelatinous blob of useless goo) refers to the numeric codes by a name! Sending a WHOIS on a user generates a numeric reply of RPL_WHOISUSER — which is actually numeric 311. This makes things exceptionally hard to follow and you’re better off experimenting with viewing the raw output from the server.
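One way to cope with the name-versus-number mismatch is a lookup table. The Python sketch below only covers the numerics mentioned in this post, using the names from RFC2812; the sample lines are invented, and the function assumes every line starts with a prefix.

```python
# Only the numerics discussed in this post, with their RFC2812 names.
NUMERIC_NAMES = {
    "001": "RPL_WELCOME",
    "005": "RPL_BOUNCE",  # the RFC's name; in practice it carries server config
    "311": "RPL_WHOISUSER",
    "353": "RPL_NAMREPLY",
    "366": "RPL_ENDOFNAMES",
    "372": "RPL_MOTD",
    "375": "RPL_MOTDSTART",
    "376": "RPL_ENDOFMOTD",
}


def name_reply(line):
    """Return a readable name for the command in a prefixed raw line."""
    command = line.split(" ", 2)[1]
    if len(command) == 3 and command.isdigit():
        return NUMERIC_NAMES.get(command, "unknown numeric " + command)
    # Non-numeric commands such as JOIN or PRIVMSG already are their name.
    return command


print(name_reply(":irc.example.net 311 bot nick1 user host.example * :Real Name"))
print(name_reply(":nick1!user@host.example PRIVMSG #liek :hi"))
```

Note that the keys are strings, not integers, so the leading zeros in 001 and 005 survive.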
Security
For a while, there seemed to be a trend in having SSL connections to servers. It makes sense for, say, HTTP or FTP, *real* protocols I mean. But because SSL connections are not required for every client, IT BASICALLY MAKES THE WHOLE THING POINTLESS!!! Yeah, now I’m yelling!
You know those popups that you get in your web browser that say, “This page contains both secure and non-secure items.” — You have the option to ignore the non-secure items. If you had that option on IRC, it would basically ignore 90% of the people on there.
Packet Length
It bothers me when connection-oriented protocols (and by protocol in this case, I mean…. you get the idea…) reimplement stuff that TCP already does. Actually, I think the IRC protocol is the only one that does this.
TCP already breaks things up into packets and lets you know when the stream terminates. Think about HTTP: you send the GET request, stream everything down, and then it terminates and you’re done. If you’re getting a 2MB HTML doc (god forbid), you don’t get it all at once, but no HTTP client on earth has to check for a line termination every 512 chars and then check to see if it sent the “End of HTTP Stream” message.
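For the record, this is roughly the bookkeeping an IRC client ends up doing on top of the socket. It is a minimal Python sketch, assuming lines end in CRLF; the `LineBuffer` name and the sample bytes are made up, and a real client would also want to cap the buffer size.

```python
class LineBuffer:
    """Reassemble CRLF-terminated lines from arbitrary chunks of a byte stream."""

    def __init__(self):
        self._pending = b""

    def feed(self, data):
        """Add received bytes and return every line completed so far."""
        self._pending += data
        # Everything before the last CRLF is complete; the tail is kept
        # until the rest of that line arrives in a later chunk.
        *lines, self._pending = self._pending.split(b"\r\n")
        return lines


buf = LineBuffer()
print(buf.feed(b"PING :irc.exa"))
print(buf.feed(b"mple.net\r\n:a!b@c PRIVMSG #liek :hi\r\nNOTI"))
print(buf.feed(b"CE bot :done\r\n"))
```

The chunks passed to `feed` stand in for whatever `recv` happens to return, which has no relationship to where the lines begin or end.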
Unicode and I18N
The Marquis de Sade, being the awful, evil man that he was, made no plans for IRC to ever be in anything but 8-bit ASCII. Which is weird because he said that hundreds of years before 8-bit ASCII encoding was invented. I have yet to find *any* information in 10 years, whether official or “official,” about how Unicode works with IRC. I’m sure non-ASCII characters can be sent, but as for the server encoding, BOM, etc… it’s a mystery.
CTCP and DCC
At some point, some genius decided that, instead of extending the protocol, extensions could be hacked into an existing message.
There is, of course, yet another document about this “part.”
Trying to add support for these basically makes your handler for PRIVMSG 9000x bigger than it was. Also, you give yourself a massive headache and you start to think, “I must be a friggin’ masochist to be working on this.”
The biggest crap-sandwich that one must eat is that the Client-to-Client Protocol goes through the Server. I’m gonna start yelling again. How about a rule that says you can’t call it a protocol if all you do is wrap some capitalized letters with \001? I’m gonna make an HTTP variant called FATTP (Fucking Awesome Text Transfer Protocol) which takes regular HTTP verbs and wraps them with \001. Fuck you.
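In case the \001 thing sounds like an exaggeration, here is a small Python sketch of what the wrapping amounts to. It only handles the simple case of one CTCP request filling the whole message text, and the function names are invented for illustration; the quoting rules in the CTCP documents are not covered.

```python
CTCP_DELIM = "\x01"  # the same byte written as \001 in octal


def parse_ctcp(text):
    """Return (command, argument) if text is one CTCP request, else None."""
    if len(text) >= 2 and text.startswith(CTCP_DELIM) and text.endswith(CTCP_DELIM):
        body = text[1:-1]
        command, _, argument = body.partition(" ")
        return command.upper(), argument
    return None


def make_ctcp(command, argument=""):
    """Wrap a command (and optional argument) in the CTCP delimiters."""
    body = command if not argument else command + " " + argument
    return CTCP_DELIM + body + CTCP_DELIM


print(parse_ctcp("\x01ACTION waves\x01"))
print(parse_ctcp("just a normal message"))
print(repr(make_ctcp("PING", "12345")))
```

Every PRIVMSG handler has to run a check like `parse_ctcp` first, which is where the extra bulk comes from.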
But DCC takes the cake. Not only is it ass-backwards and full of errors — once again, you must reimplement parts of TCP stream control in order for it to work right. DCC Send is the real culprit — and it was supposedly replaced by DCC XMIT in 1997, but good luck finding anything that uses it.
The CTCP-97 document intends to offer an official spec for formatting codes. The problem is…
mIRC
The biggest piece of crap in IRC today is mIRC. For some reason, it is the standard for IRC clients. It uses a broken color formatting scheme, old DCC code, horrible scripting, and a poor interface, and yet you still have to pay for it. Many, many times during the development of Zephyr (the IRC client I was working on from 2002-2004), we came to a point and said, “We don’t really know what to do here… What does mIRC do?”
So, given all these reasons, why can’t I convince my friends to move to jabber or something? Don’t they know what kind of state it puts me in?
At least this time, I’m only writing a bot. And, aside from reading the IRC RFCs… *sigh* I’m enjoying it.
See? Masochism.
I’ve noticed a lot of math/comp-sci guys end up dating crazy girls. Or at least in retrospect they seem crazy.
i think it is how our brains are wired. to do what we do in front of a terminal all day every day and not go on a killing spree we have to be a bit different. and crazy females sense that. so i blame women for taking advantage of us.
Hahaha! Well put…