If you’ve been following my Twitter account at all, you’ve probably noticed by now that I’ve become an avid mobile device (i.e. smartphone) user, and a fan of Android in particular. This isn’t just a passing phase for me, nor is this a technology fad that’s just going to fade away. Mobile technology is really taking off, and I wouldn’t be surprised if a paradigm shift won’t occur—if it hasn’t already—where more people will be using smartphones and mobile devices to access the Internet and other online services than using a full desktop or laptop. There are other contenders vying to be our one-and-only window to the digital world, like set-top boxes, digital TVs, and such, but nothing is as personal and portable as the smartphone and its bigger brother, the tablet.
That said, I’m not in the camp that believes that the Web is dead and that mobile apps are the way of the future. I’ve expressed my feelings on that here before. Apps won’t and can’t be the end-all, be-all interface to data and the mobile Web will always have a place. Thus the mobile browser is one of the most important apps a smartphone can have. That said, most browsers on smartphones are anemic, underpowered, and severely lacking in important functionality. Smartphone manufacturers and OS authors want us to believe that we can leave the laptop behind and work entirely from that wondrous miracle in our pocket, but fail to deliver the tools we need to make that dream a reality.
My case in point: client-certificate authentication. As a very brief summary, the entire industry of e-commerce rests entirely on a set of encryption technologies such as HTTPS, SSL, TLS, etc., that allow secure, private communication between a client (such as an online shopper) and a server (an online store). The server authenticates itself to the client by using a digital certificate, signed by a trusted certificate authority which has investigated and authenticated the server as a legitimate entity. The client can rest assured that the server belongs to the authenticated entity because the certificate uses strong public-key cryptography to provide a chain of trust back to the authenticating authority. Without this technology in place, we wouldn’t be able to tell legitimate businesses such as online retailers and banks from the phishing scams so prevalent on the Web. (This doesn’t always solve problems between the keyboard and the chair, of course, but it is effective as long as the wetware interface is working properly.)
But digital certificates can be used to authenticate the client as well as the server. Many businesses and governments use client certificates to authenticate users to secure systems. For example, I use a government-issued Smart Card to authenticate with my client’s servers. On this card is chip that contains my digital certificate, signed by a private certificate authority. When I authenticate with the client’s services, the private key on the card creates a digital signature which the server can authenticate against my public key, the inverse of what happens between the online shopper and the store front. Thus, I can trust the validity of the government’s certificate and know I’m connecting to their servers and no one else, and they in turn can validate that I (or the person who has my card) am who I say I am and let me in. I use a similar technology with GPF, although I import my certificates directly into the browser rather than use an external card. I created my own private certificate authority and issue client certificates to each browser I wish to use to access my admin interfaces. That way, I know only certain machines can access those portions of the site, offering a lot more security than just a simple password can provide.
This isn’t a new technology. SSL has been around almost as long as the Web itself, and it wasn’t long before the model was flipped around to authenticate clients to servers as well as servers to clients. This is a tool used by businesses every day all over the world. Every desktop browser supports client certificates because they are a standard. Any browser that doesn’t support them is likely to be overlooked or ignored in favor of browsers that do.
Yet the support for client certificates on mobile devices is appallingly absent. I know the built-in Android browser doesn’t support it, and I created an issue in Google’s official Android issue tracker to complain about it. Android supports client certs for WiFi authentication, but not in the browser, e-mail, or any other key service vital to secure business communications. Supposedly support for this functionality is going to be added in future versions of Android, but that doesn’t help me or any of the millions of current Android users until it comes time to upgrade our devices. I’ve read in various places that the iPhone supports client certs, but I’ve never been able to get any of the solutions to work with my iPod Touch (essentially an iPhone minus the annoying contract and poor service of AT&T). The only success I’ve had in this area has been with Firefox Mobile, which is pretty much a Firefox 4 release candidate smooshed and crunched down to fit on a mobile device. It’s bloated and a lot slower than Android’s built in browser and there’s no handy UI for importing certs like there is on the desktop, but if you take a sledgehammer to it and do some manual file tweaking, you can import your client and CA certs into the certificate database and use it effectively.
Seriously, guys… you want your devices and mobile OSes to be taken seriously by businesses as tools to take our work out of the office and on the road. Yet, you don’t give us the essential tools required to take advantage of this amazing freedom. Sure, you tell us “there’s an app for that”, but frankly, there isn’t. I’ve looked, and they’re not there. Apple won’t let third-party browsers compete with Safari on iOS and none of the Android add-on browsers support client certs either. Only Firefox, a desktop browser masquerading as a mobile app, comes close, and it takes a bit of technical wizardry to do something that should be a quick five second import. Someone’s got to step up to the plate and make some progress here, or no business that really understands security is going to take the mobile space seriously.
By now, the tech savvy among you have probably heard of Firesheep, the infamous unofficial Firefox plugin that lets you swipe other people’s session cookies and impersonate them on various popular, less-than-secure websites if you and they share the same unencrypted WiFi access point. The less tech savvy ones probably could care less, or are so terrified and spooked that you’ve turned off and unplugged your computers, buried them in a 20-foot-deep hole in the backyard, and layered on top of them concrete, asbestos, Kevlar, lard, and ten thousand old AOL CDs you’ve been hoarding in the closet since 1990.
OK, I was only kidding about the lard.
Last week I tweeted that “Firesheep makes me want to weep for the Internet and laugh maniacally, both simultaneously”. That’s no exaggeration. On one hand, it’s performing wonders by raising awareness of just how insecure many of our favorite sites really are. The problem Firesheep exposes has been around for ages; hard-core hackers could perform all the tasks that this plugin does through readily available tools and a lot of dedicated logging and log scanning. What Firesheep does is take a complicated, hard-core hacker task and make it bone-headedly simple: install, scan, infiltrate. It provides a wake-up call to Web 2.0 developers that they need to look seriously at security rather than just pay it lip service. And at this task it seems to be doing quite well; already Google has made moves to force SSL for all GMail access and Facebook is mumbling under its breath that they’re “looking into it”.
What scares me about Firesheep is the bone-headedly simple aspect. I won’t get into the ethics of responsible disclosure of security flaws, but releasing a tool like this that makes such a questionable task as simple as clicking a button is bound to have repercussions. Putting this tool in the hands of everyone means putting it in the hands of everyone, no matter what color hat they wear. Yes, we’ll hopefully see lots of increase in security at many of the websites we use every day, but how many innocent and ignorant users will be maliciously attacked before those changes occur? The gun was a very useful tool for early pioneers to hunt and protect one’s family, but it’s also useful for criminals to steal, coerce, and murder their victims. Technology is inherently amoral; it is people that are moral or immoral.
I won’t go into the details of how Firesheep works or the many ways it can be easily thwarted. A quick spin by your favorite search engine will likely provide all the information you may need. However, I did want to take a few minutes to publicly analyze the various aspects of this site and the GPF site and reassure all my readers that your information should be reasonably safe. Right now, it looks like the person most likely to be impacted would be me, directly or indirectly, and the risks are actually pretty darn low.
First up, this site: Firesheep does indeed include information on how to “hack” WordPress. Well, how to hack WordPress.com. Since Neural Core Dump is self-hosted, the built-in attack against WordPress.com hosted blogs won’t affect us here. However, Firesheep is open source, so it is trivial to modify the code to attack specific domains, so the WordPress.com attack can be tweaked to attack an individual self-hosted WordPress blog. My original assumptions here proved to be incorrect; in looking back over the the Firesheep code, it doesn’t look specifically for WordPress.com domains, but for common cookie names used by all instances of WordPress, whether it’s self hosted or not. Thus, any logged-in user here could potentially be exposed. In this case However, this blog’s small size becomes its advantage; the likelihood that anyone will directly attack it is pretty low, and even then I keep extensive backups and can easily back out malicious comments or posts. (Mind you, being too small should not be used as an excuse not to be concerned, just that the threat can be downplayed for the time being.) I rarely use public, open WiFi hot spots (to be honest, there aren’t that many of them around where I live), and on the rare case that I do, it’s easy enough for me to create an SSH tunnel to my home Linux box and proxy all my HTTP traffic through it.
As for GPF, all logins occur over SSL, so no passwords are ever sent in the clear. Of course, Firesheep does not sniff passwords but rather session cookies, so this isn’t really the problem. I thought of a few scenarios where Firesheep could be used against GPF to varying degrees of success:
Again, GPF’s probably far too small a target for anyone to really bother with, but the fact is that so little attack surface is visible that the only person likely to be hurt by it is me.
There, I hope I laid all your GPF/Firesheep fears to rest. What was that? The only person really concerned about this was me? Oh… well, in that case… um… never mind, I guess.
UPDATED November 4, 2010: Updated the paragraph about this blog to correct an incorrect assumption about only WordPress.com blogs being affected.
This week an couple errors were reported in the custom CMS application I built at work a couple years ago. I haven’t touched this code in at least a year, so it took me bit to swap some mental virtual memory and recall how everything worked. I’m not sure if these “bugs” were something new that had manifested themselves after a recent platform upgrade or design flaws that had been there since the beginning only to be recently noticed. None of that really matters for the sake of this post, however. Suffice it to say there were two problems, one of which was likely to be entirely my fault but relatively easy to fix with a little bit of C# hacking.
The other problem was a bit obscure. The application is built in ASP.NET 2.0 and written entirely in C#. It also makes use of Microsoft‘s AJAX Toolkit for ASP.NET to “pretty up” some of the interface interactions. Unfortunately, one particular user began to experience problems with the system recently. Since she’s the project manager, needless to say the problem was escalated to top priority with little to no delay. To make things more difficult, the problem was especially cryptic. In true Microsoft fashion, the pop-up JavaScript error dialog offered little to no useful information:
Sys.WebForms.PageRequestManagerServerErrorException: An unknown error occurred while processing the request on the server. The status code returned from the server was: 500
Google, of course, is my friend and found no shortage of pages where this turned up. The odd thing was that none of the purported causes for the error were anything that I was using.
After much searching, I finally happened upon this site. It seems Ted Jardine hit the same problem I did. He had narrowed it down to something to do with the .NET session, which he wasn’t really using but I was using extensively. What I found most interesting was his solution:
So, based on one of the comments in one of the above posts, even though I’m not touching session on one of the problem pages, I tried a hack in one of the problem page’s Page_Load:
Session["FixAJAXSysBug"] = true;
And lo and behold, we’re good to go!
I followed the various links he provided—as well as Googling for “FixAJAXSysBug” itself—and found lots more anecdotal evidence to support its usefulness. I applied this “fix” to the common header of the application to make sure it took affect everywhere and, so far, all reports seem to indicate its success.
Needless to say, I was instantly reminded of this GPF strip from the crossover with Help Desk. I can’t remember now if that joke was my idea or Chris Wright’s. It doesn’t matter now, really… it audacity is as brilliant now as it was eight years ago. The idea of setting a simple Boolean flag to “turn off bugs” is something I will always find hilarious.
Now if only all Microsoft bugs were so easy to fix….
Here’s a clarification of my recent Tweet about Diana. Sometime over the weekend Diana, our primary Linux box that serves as the backbone of our home network (DNS, file server, internal Web server, SSH gateway, SVN repository server, etc.), gave up the ghost. I only discovered this yesterday evening, so I haven’t had much time to diagnose the problem. It’s almost certainly a hardware issue. I’m thinking it’s the power supply or the motherboard, as when I try to power her up, nothing happens. The power light comes on, I can watch the CPU fan twitch like it wants to start spinning, but otherwise nothing else visible occurs. No output makes its way to the monitor so there are no error messages to follow.
At this point, I’m not sure of the status of the hard drives. My hope is that they’re fine; the obvious problem appears to be occurring before they even start to spin, as if they’re not getting any power (and that’s why I suspect it’s a power supply issue). The good news is that Demeter, her predecessor, has been sitting idle and collecting dust and has since been rapidly pressed back into service. I should be able to slip Diana’s disks into Demeter, check their integrity, and hopefully recover the data. That’s the core thing right now, getting the data off; hardware is replaceable, data is not. The only hitch is that Demeter is old enough that I’m not sure her BIOS will read Diana’s larger disks. Demeter’s current HD is already larger than her BIOS supports, though, and Linux seems to work fine in this situation, so I’m hoping that won’t be a problem. A worst-case scenario might be to throw a live Linux distro into Athena, our current “alpha” Windows XP desktop, and try to grab the data that way. (Diana’s disks are in ext3, which obviously Windows can’t read.) Both Demeter and Diana have EIDE drives while Athena uses SATA, but I’m almost certain Athena also has legacy EIDE on the motherboard somewhere; if not, I’m hosed there.
Why might this be a concern to you? Well, for one thing, Diana was one of several redundant backup locations for storing my my high-resolution original strips. Fortunately, everything from Year Nine and back has already been backed up to multiple DVDs stored in multiple physical locations, while Year Ten’s files are stored across three redundant drives (two in separate physical machines and one external USB drive). More importantly, Diana was my SVN repository server, housing all the source code for the GPF site. I have working copies of that repository in multiple locations so I’m not hurting there, but with the repository down I’m stuck manually keeping those working copies in sync. The biggest problem that may affect you guys is the humongous time sink this will be for me to repair/replace Diana and get all our internal mechanisms working again. With my day job, two hours of commute, and toddler patrol vying for my time, my comic production schedule is severely squeezed as it is. This is probably going to impact that buffer I was forced to take a hiatus in December to reclaim as I wasn’t able to increase my production, just maintain the status quo.
For those of you who might care, I’ll post updates here when I can. More frequent cries of frustration will likely come through the Twitter feed. If the comic will be severely impacted, you’ll get something in the GPF News. So keep watching those RSS feeds.
Not long ago, I took advantage of a nifty WordPress plugin to enable XML sitemaps for the blog. For those who’ve never heard of XML sitemaps (I hadn’t for quite a while), they are little XML files in a specific format that give search engines like Google hints on how to index your site. They don’t necessarily improve your search rankings per se, but they help the search engine better decide what to index, when it was last updated, relative priorities of different pages, etc. You then throw a special line into your robots.txt file or directly submit the file to the search engine to let it know the file is available. Once the engine knows about it, it will check it periodically to optimize how the site is indexed.
The plugin, of course, makes this ridiculously easy for WordPress. However, GPF gets orders of magnitude higher traffic than the blog does, so finding a way to generate sitemaps there would be ideal. I toyed with the idea for a while until I finally sat down, examined the sitemap specification, and figured out how to roll my own code. It now successfully runs via cron each morning and gives a pretty thorough census of what’s available on the GPF server. The problem is that the GPF site is divided into several parts that are largely autonomous and self-contained:
Ignoring the forum, that left me three major sub-projects for creating sitemaps. It’s easy enough to segregate these into separate files and tie them together using a “sitemap index” file, so that wasn’t a problem. The archive would just be a formatted dump of the archive database, deriving approximate update times from the posting date. The bulk of the rest of the site could be done by stepping through the file structure of the site and taking note of every HTML or PHP file and its last modification time (conveniently ignoring certain files and directories that don’t need to be counted, like access-restricted Premium pages). And that leaves the wiki.
I managed to come up with a decent wiki sitemap routine that I thought I’d share, just in case someone else might be interested. Of course, it’s not likely to be useful for massive wikis like Wikipedia—sitemaps are restricted to 10MB in size and 50,000 URLs—but something small like the GPF Wiki would be easy to submit and index. It was built using MediaWiki 1.12.0; I am uncertain what database changes may be needed for older or newer versions. Here’s my current process:
I only want to index relevant pages, including category pages. The relevant database table for this is “page”. (How… convenient). Unfortunately, this table also contains things like redirects and images. Each image has its own “page” assigned to it; try clicking on an image in Wikipedia or in the GPF Wiki to see what I mean. The time stamp of the latest revision, however, is stored in the “revision” table, joined to the page table by the latest revision ID number. So a good starting bit of SQL would be:
select p.page_title, r.rev_timestamp from page p, revision r where p.page_latest = r.rev_id and p.page_is_redirect = 0 and p.page_title not like '%.gif' and p.page_title not like '%.png' and p.page_title not like '%.jpg';
Unfortunately, this also returns a few meta pages like the sidebar and editing pages. Before selecting, I define a look-up hash of titles I want to avoid and as I loop through the results I just skip those.
The title, of course, is both the displayed title and the input portion of the URL that uniquely identifies the page. Thus, knowing the base URL (http://www.gpf-comics.com/wiki/) I can easily reconstruct the public URL of any article from the title. As with Wikipedia links, spaces have already been converted to underscores, but the rest of the string needs to be be URL encoded. This is easy enough, so we can quickly build the full URL as required by the XML schema.
The time stamp is a little bit tougher. MediaWiki stores time stamps as a 14-digit number in YYYYMMDDHHMMSS format, always in UTC time. In Perl (in which almost all my crons are coded) this is easy enough to break apart and turn into a UNIX time stamp. I then output the date in W3C ISO 8601 format as required by the schema. A sample of a resulting entry would be:
<url> <loc>http://www.gpf-comics.com/wiki/Nick</loc> <lastmod>2008-08-22T06:00:07Z</lastmod> <changefreq>monthly</changefreq> <priority>0.3</priority> </url>
Change frequency and priority are purely guesses and fudges for mine. According to the sitemap specification, priorities are purely relative to other parts of the site. I rated the wiki pages as relatively low since the wiki at GPF is considered a “supporting” page and subordinate to things like the archive. As for change frequency, the sitemap specification includes a number of predefined choices (hourly, daily, weekly, monthly, etc.). Monthly was a purely off-the-cuff guess; some pages may update more or less frequently, but monthly would be a good average. It is entirely possible to rate select pages as higher priority or frequency than others, but I decided to take the easy route and rate everything the same. To apply different values, you just need to pay special attention to the title and assign a non-default value when that title crops up.
Well, I hope someone out there might find this helpful. I’m not sure if it really helps anyone find anything at GPF, but it was a fun little exercise nonetheless.
I hope to post more on this when there’s more data to post, but I thought I’d throw up a quick note stating that the latest episode of the Security Now! “netcast” features a question posed by yours truly. (The best part was listening to Leo Laporte stumble over my long-winded rambling.
) The high-quality version of the show can be found at the previous link; a low-bandwidth version as well as a text-only transcript can be found at the corresponding page at GRC.com. A search in the transcript for “Darlington” will take you to the beginning of my question; in the netcast, it starts around 38 minutes, 22 seconds in. (Of course, I encourage everyone to read/listen to the entire thing.)
For the full effect, though, you’ll also need to listen to/read the previous two non-Q&A episodes of the show, #149 and #151. (Low-bandwidth and trascriptions can be found here and here.) The entire dialog concerns the recent trend of ISPs selling out their customers to allow third-party advertisers to come in and install hardware at the ISP to facilitate tracking the ISPs’ customers’ surfing habits across sites. While the ad companies in question claim to not be recording personally identifyable information about the ISPs’ customers, the capability is there and the possibilities for abuse are enormous. It brings back many shades of the DoubleClick controversies of the late 1990s-early 2000s, only much more ominous. I provided a unqiue standpoint to the discussion: that of a Web developer hosting a site and encountering similiar mysterious “first party” cookies set for my domain but not set by me.
The full body my question is present, but I’m not completely satisfied with the answer.
Let’s just say I think Steve Gibson made an assumption about the GPF site that’s not 100% true. I’ve replied to his response with additional information. I don’t necessarily expect another response (he does, after all, have his own agenda to follow on his show), and even if he does it will likely be in episode #154, the next scheduled Q&A episode. If anyone is interested, I’ll post updates if and when this occurs. If I don’t get a response, I’ll post my response here, especially since it contains some disturbing observations about “first party” cookies that have mildly paranoid folks like me nervous. (I’d hate to see what it does to really paranoid people.)
So ICANN, the organization that oversees the doling out of domain names on the Internet, has approved the relaxation of the rules for top-level domains (TLDs) to allow for arbitrary TLDs for whoever has the money and technical capability to grab it. If things go according to plan, by the middle of next year you may be able to just type into your browser something like http://search.google/ rather than http://www.google.com/, or perhaps you’d rather http://drink.coke/ or http://drive.ford/ or even http://have.crazy.monkey.sex/.
To quote virtually ever character in the Star Wars universe, I have a bad feeling about this.
I am so sitting on the fence on this one. My initial gut reaction is this can’t be a good thing. I know far too many non-techies who are confused by Internet addressing as it is, so let’s confuse them some more by adding even more things for them to figure out. JD Fraizer over at User Friendly hit the nail on the head; anyone who has ever used Usenet is probably rolling their eyes a lot more lately. The potential for cybersquatting and trademark dilution is enormous. ICANN insists that an “objection-based mechanism” will be in place to prevent such things, but how much red tape (and legal dollars) will someone have to go through to protect their brand? Every day that a squatter sits on a domain equates to valuable time, money, and reputation that can be lost, something big corporations may be able to wait out but little guys like me can’t afford. It’s been hard enough right now for me to keep up with all the variants of gpf-comics.something out there. And let’s not get into the discussion of what “offensive” TLDs creative individuals might come up with….
Of course, it’s not like I’m going to be registering .gpf anytime soon anyway. I suppose that’s one thing ICANN did right: to create your own TLD, you’ll need a truck load of money first. The CBC is reporting an estimated $100,000 per TLD—I have no idea if that’s Canadian dollars or not—but ICANN only says for now that “fee information is not yet available”. Ordinary domain names are dirt cheap nowadays, which is a blessing to small-time operators like me but a curse in that squatters with cash to burn can snap up thousands at a time and hold them for ransom. At least starting a new TLD will take capital, making it a serious investment. It will also be quite a technical undertaking; owning a TLD also means you have to build the infrastructure support it. So if Google were to grab .google with their pocket change, they’ll also need to pony up the hardware and bandwidth to maintain the root server. Google may be a bad example (they’ve got servers to spare, I’m sure), but for organizations not used to maintaining that kind of “big iron” it will be a significant learning curve.
But then it occurred to me… how awesome would it be if all your favorite comics or comic-related sites could found at “something dot comics”?
Imagine if you will that some philanthropic comics creator/reader with a hundred grand in “mad money” under his bed were to snatch up .comics and register that with ICANN. Being philanthropic, this individual would charge a minimal fee to register a domain there, just enough to cover operational costs and maybe make a modest living in the process, aggregated out to anticipated demand (of which I’m sure there’d be plenty). There would be only one additional requirement for application beyond the current standard (ethical) process: the domain must be used for a site publishing, promoting, or discussing comics in some way, shape, or form. Consideration for approval would require proof of content, such as a preview development site, previously published work, portfolios, etc.—just enough to prove the site really will be used for something comic-related. Individual titles would be encouraged to register at the root level (dilbert.comics, gpf.comics, x-men.comics) while companies would register their names (dc.comics, marvel.comics, keenspot.comics) and potentially use sub-domains for their own titles (x-men.marvel.comics). Our hypothetical philanthropic registrar would also be fair and balanced as to not let big conglomerates dominate the little guys. Disputes over domains would come down to traditional copyright and trademark resolutions, requiring proof of prior art, etc.
Wouldn’t that be just grand?
Of course, what will really happen will be that some big company will come along and buy up .comics with far more misanthropic intentions (and we know such an obvious TLD wouldn’t sit dormant for long). They’d either squirrel it away selfishly for promoting their own works and no one else’s, or they’ll charge such an exorbitant “premium” price for registrations that only big publishing houses like DC, Marvel, etc. will be able to afford it, shutting out the little independents and webcomics. Even if they price it fairly and keep it open, I’d bet it would get so swamped with squatters that the novelty of the whole TLD would become as diluted .info is today. Maybe it’s just that I’m pessimistic… or that I’ve been annoyed for so long that some jerk had been holding gpf-comics.org hostage for years… but I just don’t see this turning into as promising a possibility as I think it could be.
Oh, well. I’ve been waiting for gpf.com for nearly a decade now. I guess I can just add gpf.comics to the list. Wishful thinking….
The new GPF site has been running live for half a month now, and I’m proud to say things have been running incredibly smoothly. That is, at least, from my perspective; I haven’t seen any major glitches, and aside from a few typos in the comic (which are obviously independent of the site code), nobody has written me about any problems. This is especially heartening because the new site was pretty much entirely coded by hand by me, sans a few bits and pieces. (I can’t take credit for the OS, the web server software, the database engine, or the forum. But everything else… yep, that was me.)
There were a lot of motivations for writing my own archiving system, but the primary one was efficiency. While I considered trying something off-the-shelf, so to speak, like ComicPress or Drupal, I really wanted something that would be blazingly fast yet still dynamically generated to let me do things like GPF Premium on the server side, primarily for security reasons. (Server-side processing means no messy JavaScript is required by the users, thus exposing them to less risks, while Premium content doesn’t even get sent to the browser at all if Premium isn’t enabled.) So the GPF site is optimized out the wahzoo, with certain high-volume pages built once by nightly crons while others that require more interactivity reduce database queries to simple selects as much as possible. I’m never one to brag and toot my own horn, but I’m actually pretty proud of the new site and how responsive it is.
Of course, I can’t really take all the credit. I do have to give some serious props to XCache.
For those unfamiliar with PHP, it is one of many server-side, interpreted scripting languages commonly used for dynamic Web site development. The caveat, however, to any interpreted language is that on each request the source script must be read, parsed, compiled, and executed before anything is set back to the end user’s browser. This is one reason why dynamic sites are and will always be slower than serving purely static HTML files. Static HTML just needs to be read and regurgitated; anything that requires the Web server to actually think takes more time. Add to that the fact that there could be hundreds or even thousands of requests all competing at once for content and it’s a miracle anything get served at all.
XCache is one of several opcode caching extensions for PHP. Essentially, when the first request for a script is made, the script is parsed and compiled as usual. However, XCache stores the compiled code so subsequent requests can skip the parsing and compilation steps and go directly to executing the code. This significantly increases the speed of execution by eliminating one of the costliest parts of the process (except perhaps database connections). In addition, XCache also includes the ability to cache variables and objects, so commonly repeated and expensive variable generation–such as the cryptographic hashes I use for salting cookie hashes or database look-ups for common elements like the Premium subscription levels–can be stored in the cache rather rebuilt on each request.
I was first introduced to XCache by the XCache for WordPress plugin, which was probably mentioned in one of the development feeds built into the WordPress dashboard. I’ve been running this combination here on the blog for a little while with moderate success; I’m still trying to find a good balance of configuration settings to get the best results, but I’ve been happy with the results so far. Without putting much thought into it, I went ahead and installed XCache on the GPF server, hoping that it would help even if I never got a chance to optimize it. Fortunately, it has helped, and now that I’ve optimized the settings it’s exceeded most of my expectations. I’m not sure if there’s something about my code that caches better than WordPress, but GPF has done much better with XCache than the blog has.
Admittedly, I haven’t compared it to any other opcode cachers, nor have I benchmarked it against any of the competition. That said, however, I heartily recommend it to anybody running PHP applications. To get the greatest benefit, you may need to modify some code (or install a plugin if you’re using a prepackaged application) to take advantage of the variable/object caching. But even without modification the opcode caching alone makes for a vast improvement.
Not sure if anyone noticed, but both the blog and the new GPF beta test site were down last night. Our hosting service, Slicehost, informed us that a breaker blew in their data center and they were forced to bring a number of machines down to protect them. In addition, the blog server (which also hosts a number other private sites I run) stopped responding, so they had to reboot it again.
Unfortunately, while Slicehost was very informative and sent me several e-mails to keep me apprised of the situation, the sites continued to be down until early this morning. That’s when I discovered that for some bizarre reason the MySQL and Apache services were not configured to start at boot time. This is baffling, in my opinion, as I thought this was automatic with Fedora. You install the application package and, if it’s a service like this, it also installs the appropriate links in the init directories to make sure the services start on boot. Not so, apparently. I’m not sure if this is Fedora’s fault, Slicehost’s, or mine, to be honest, but it should be fixed now.
There’s one part of me thinks that this outage is an ominous sign on the eve of my leaving Keenspot. Then again, it also helped me catch a critical flaw that would have been extremely annoying if it happened a week later, after the move when thousands of readers would be hitting the new site. So I don’t know whether to be paranoid or relieved. (O_O)
Anyone interested in the history of webcomics should check out this week’s episode of the This Week in Tech (TWiT) podcast. Especially since it has nothing to do with webcomics.
Here’s my line of reasoning: In this episode, Leo Laporte and his unusual round of suspects are joined by Jonathan Coulton, geek musician extraordinaire. Aside from discussing a few topics of current note (like the death of HD DVD), they discuss a recent concert by Coulton where Leo and company joined him to play Rock Band before a nerd-filled audience. They go on to talk about the “new” Internet phenomena of niche entertainment targeting–skipping the big, mass-market blitzkrieg typically used by music, TV, and movie studios and canvasing thousands or millions of potential customers, to instead go directly to your core fans, the few dedicated people who are the ones that will really appreciate what you do. Coulton talks of making a living catering to a small handful of hard-core fans and how this is much more fulfilling that the big media alternative, where both the artist and the audience are faceless statistics on the bottom line of a balance sheet. And they discuss this with such freshness and enthusiasm, as if this is were the next new thing, some epiphany that no one has yet uncovered.
What I find so funny about it is… those of us in webcomics have already been doing this… for years.
I’ve noticed this a lot over the past near-decade of GPF‘s existence. Blogs, podcasts, and other forms of grass-roots media have all cropped up during that time, putting publishing power in the hands of the masses, becoming “innovative” and “groundbreaking” in bringing content production to the people. But a fair number of “new” trends (and problems) associated with these technologies are things I remember seeing crop up among webcartoonists several years before. Long before the term “blog” was coined, I remember chatting with other cartoonists on mailing lists and news groups, swapping ideas about search engine optimization (before that term was coined as well), getting and retaining readers, how to monetize your site, etc. It’s entertaining now to watch many tech headlines to see “fresh” ideas crop up that I’ve personally tried–and abandoned–a couple years before. It’s like the wheel reinventing itself every couple of years, only with different colors and/or materials.
Of course, I would never be so conceited to believe webcomics “did it first.” Webcomics themselves borrow heavily from the underground comics movement of the 1950s, 60s, and 70s, where small independent publishers ducked under government sensors to push out innovated and controversial content directly to the people who wanted them. What changed between then and now is that the interconnectivity of the Internet moved this from basements and back rooms to hidden mailing lists and chat rooms, eventually making its way to the mainstream, all while expanding the sphere of availability from isolated pockets of common interest to global reach. It would also be naive to believe this flow of “innovation” is one-way; RSS and other syndication technologies took off first in the blogosphere, and was only later ret-conned and shoe-horned into webcomic automation systems as a handy update notification system.
Perhaps one of the reasons bloggers and podcasters didn’t learn any lessons from webcartoonists is the difference between skill level–real or perceived, take your pick–required for entry. Cartooning obviously requires some level of artistic talent as cartooning, in all of its myriad of forms, is a form of art. It’s often a commercial art, intended more to generate revenue than anything else, but an art nonetheless, conveying ideas and emotions graphically. And while a well-crafted blog certainly requires a talent for writing, that is often easier to come by than the ability to both write and draw. Thus the critical mass of webcartoonists is much smaller than that of bloggers and podcasters, making it less noticeable to the mainstream. That’s also why “break-out” blogs now seem to be a dime a dozen, but it’s still major news when an online comic gets noticed by big media and gets optioned for TV/movie deals. Everyone knows about blogs and maybe even reads a few, but there are other comics on the “intraweb” besides Dilbert?
I’m not sure if there’s anything useful to these observations, other than the fact that they amuse me occasionally and it gives me something to post about. I’m not sure if anyone else has made these kinds of observations or, for that matter, anybody else cares. But I’ve often wondered if those underground cartoonists of yesteryear thought to same way about us webcartoonists as I have about bloggers. I’d like to think so, just because it creates a nice symmetry. I can’t wait for bloggers to sit around in the old bloggers’ home, thinking such thoughts about whatever comes next. “Those kids with their holocasts… if they had learned the lessons we did about AI search, they’d be raking the quatloos by now….”