If you’ve been following my Twitter account at all, you’ve probably noticed by now that I’ve become an avid mobile device (i.e. smartphone) user, and a fan of Android in particular. This isn’t just a passing phase for me, nor is this a technology fad that’s just going to fade away. Mobile technology is really taking off, and I wouldn’t be surprised if a paradigm shift occurs—if it hasn’t already—where more people use smartphones and mobile devices to access the Internet and other online services than use a full desktop or laptop. There are other contenders vying to be our one-and-only window to the digital world, like set-top boxes, digital TVs, and such, but nothing is as personal and portable as the smartphone and its bigger brother, the tablet.
That said, I’m not in the camp that believes that the Web is dead and that mobile apps are the way of the future. I’ve expressed my feelings on that here before. Apps won’t and can’t be the end-all, be-all interface to data, and the mobile Web will always have a place. Thus the mobile browser is one of the most important apps a smartphone can have. And yet, most browsers on smartphones are anemic, underpowered, and severely lacking in important functionality. Smartphone manufacturers and OS authors want us to believe that we can leave the laptop behind and work entirely from that wondrous miracle in our pocket, but they fail to deliver the tools we need to make that dream a reality.
My case in point: client-certificate authentication. As a very brief summary, the entire industry of e-commerce rests entirely on a set of encryption technologies such as HTTPS, SSL, TLS, etc., that allow secure, private communication between a client (such as an online shopper) and a server (an online store). The server authenticates itself to the client by using a digital certificate, signed by a trusted certificate authority which has investigated and authenticated the server as a legitimate entity. The client can rest assured that the server belongs to the authenticated entity because the certificate uses strong public-key cryptography to provide a chain of trust back to the authenticating authority. Without this technology in place, we wouldn’t be able to tell legitimate businesses such as online retailers and banks from the phishing scams so prevalent on the Web. (This doesn’t always solve problems between the keyboard and the chair, of course, but it is effective as long as the wetware interface is working properly.)
But digital certificates can be used to authenticate the client as well as the server. Many businesses and governments use client certificates to authenticate users to secure systems. For example, I use a government-issued Smart Card to authenticate with my client’s servers. On this card is a chip that contains my digital certificate, signed by a private certificate authority. When I authenticate with the client’s services, the private key on the card creates a digital signature which the server can verify against my public key, the inverse of what happens between the online shopper and the store front. Thus, I can trust the validity of the government’s certificate and know I’m connecting to their servers and no one else, and they in turn can validate that I (or the person who has my card) am who I say I am and let me in. I use a similar technology with GPF, although I import my certificates directly into the browser rather than use an external card. I created my own private certificate authority and issue client certificates to each browser I wish to use to access my admin interfaces. That way, I know only certain machines can access those portions of the site, offering a lot more security than just a simple password can provide.
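For the .NET-minded, here’s roughly what the client side of this looks like in code. This is only a sketch with made-up file names and URLs, not anything pulled from my actual admin setup, but it shows how little is involved once the platform supports the standard:

using System;
using System.IO;
using System.Net;
using System.Security.Cryptography.X509Certificates;

class ClientCertSketch
{
    static void Main()
    {
        // Load a client certificate (and its private key) from a PKCS#12 file.
        // The file name and password here are placeholders.
        X509Certificate2 clientCert =
            new X509Certificate2("my-client-cert.p12", "changeit");

        // Attach the cert to an HTTPS request; the server verifies the resulting
        // signature against the public key it already has on file for me.
        HttpWebRequest request =
            (HttpWebRequest)WebRequest.Create("https://admin.example.com/");
        request.ClientCertificates.Add(clientCert);

        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}

That’s the entire client side of it; the heavy lifting all happens inside the platform’s SSL/TLS stack, which is exactly why it’s so frustrating when a mobile browser simply doesn’t expose it.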
This isn’t a new technology. SSL has been around almost as long as the Web itself, and it wasn’t long before the model was flipped around to authenticate clients to servers as well as servers to clients. This is a tool used by businesses every day all over the world. Every desktop browser supports client certificates because they are a standard. Any browser that doesn’t support them is likely to be overlooked or ignored in favor of browsers that do.
Yet the support for client certificates on mobile devices is appallingly absent. I know the built-in Android browser doesn’t support it, and I created an issue in Google’s official Android issue tracker to complain about it. Android supports client certs for WiFi authentication, but not in the browser, e-mail, or any other key service vital to secure business communications. Supposedly support for this functionality is going to be added in future versions of Android, but that doesn’t help me or any of the millions of current Android users until it comes time to upgrade our devices. I’ve read in various places that the iPhone supports client certs, but I’ve never been able to get any of the solutions to work with my iPod Touch (essentially an iPhone minus the annoying contract and poor service of AT&T). The only success I’ve had in this area has been with Firefox Mobile, which is pretty much a Firefox 4 release candidate smooshed and crunched down to fit on a mobile device. It’s bloated and a lot slower than Android’s built-in browser, and there’s no handy UI for importing certs like there is on the desktop, but if you take a sledgehammer to it and do some manual file tweaking, you can import your client and CA certs into the certificate database and use it effectively.
Seriously, guys… you want your devices and mobile OSes to be taken seriously by businesses as tools to take our work out of the office and on the road. Yet you don’t give us the essential tools required to take advantage of this amazing freedom. Sure, you tell us “there’s an app for that”, but frankly, there isn’t. I’ve looked, and they’re not there. Apple won’t let third-party browsers compete with Safari on iOS and none of the Android add-on browsers support client certs either. Only Firefox, a desktop browser masquerading as a mobile app, comes close, and it takes a bit of technical wizardry to do something that should be a quick five-second import. Someone’s got to step up to the plate and make some progress here, or no business that really understands security is going to take the mobile space seriously.
In the ongoing spirit of releasing pointless Open Source software, I semi-proudly announce the release of Cryptnos 1.0 for Microsoft .NET 2.0.
So what is it? Cryptnos is a secure password generator. By now, I’m sure many of you have heard of various programs, especially browser plug-ins, that let you generate unique passwords for all your various online logins. They usually do this by combining the domain name of the site with a master password you supply, then run those inputs through an MD5 hash to give you a “strong” password that is unique for that site. Many of these applets also search the page you’re currently on for the login form and attempt to pre-populate the password box for you. Well, Cryptnos is kind of like that. Only it’s not.
Like these other apps, Cryptnos generates a password from your master password and from some mnemonic or “site token” that you supply. But that’s where the similarities end. First of all, Cryptnos does not live in your browser, so it can be used for any application where you need a strong password. As a corollary, the mnemonic does not have to be a domain name, although it certainly can be; it can be whatever you want it to be, so long as it is unique and it helps you remember what the password is used for. Next, Cryptnos gives you unparalleled flexibility in how your password is generated. You’re not stuck using just MD5, a broken cryptographic hash that is horribly out of date and which should no longer be used. You can select from a number of hashing algorithms, as well as how many times the hash should be applied. Cryptnos also uses Base64 rather than hexadecimal to encode the output, meaning your generated passwords can have up to 64 possible options per character instead of 16, making it stronger per character than the other guys. You can further tweak your generated password by limiting the types of characters used (for those times where a site requires you to only use letters and numbers) and the length of your password. Best of all, Cryptnos remembers all of these options for you, storing them in an encrypted state that is nearly impossible to crack. Your master password is NEVER stored, nor are your generated passwords; your passwords are generated on the fly, as you need them, and cleared from memory once the application closes.
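To give you a feel for the general idea, here’s a stripped-down sketch in C#. This is not Cryptnos’s exact algorithm (the real thing is far more careful about encodings, character restrictions, and storing your settings), just the basic hash-and-encode concept:

using System;
using System.Security.Cryptography;
using System.Text;

class PasswordSketch
{
    // Combine a site token with the master password, hash it a number of
    // times, then Base64-encode the result and trim it to the desired length.
    static string Generate(string siteToken, string masterPassword,
        int iterations, int maxLength)
    {
        using (SHA256 hasher = SHA256.Create())
        {
            byte[] data = Encoding.UTF8.GetBytes(siteToken + masterPassword);
            for (int i = 0; i < iterations; i++)
                data = hasher.ComputeHash(data);

            // Base64 gives 64 possible symbols per character instead of hex's 16.
            string encoded = Convert.ToBase64String(data);
            return encoded.Length > maxLength ? encoded.Substring(0, maxLength) : encoded;
        }
    }

    static void Main()
    {
        Console.WriteLine(Generate("example.com", "my master password", 10, 16));
    }
}

The key point is that nothing needs to be stored except the generation parameters: feed in the same site token, master password, and settings, and you get the same password back every time.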
Cryptnos originally sprang from the “Hash Text” function of WinHasher, which I used to generate passwords in a similar fashion for a long time. I quickly ran into limitations in using WinHasher this way, especially when it came to sites where I had to tweak the password after it was generated. I thought to myself, “I’ll never be able to remember all these tweaks for all these passwords. Why can’t I just rip this function out of WinHasher and wrap a program around it to let the computer do all the work for me?” And that’s exactly what I did. I’ve been using Cryptnos to generate and “store” my passwords for months now and I finally decided it was stable enough to release it to the world at large.
Oh, and the name? Um, well, I wanted a better one, but that’s the only thing I could find that sounded “passwordy” that didn’t have a lot of hits on Google.
Wow! A non-Twitter digest post! Amazing!
This is a quickie to let you guys know I’ve just released WinHasher 1.6. This is a minor release containing a few cosmetic and minor functional changes, so there’s no need to upgrade unless the features or bug fixes listed below seem worth the effort.
For those who don’t know, WinHasher is a cryptographic hash generator for Microsoft .NET. It is roughly analogous to digest programs on other platforms (such as “openssl dgst” from OpenSSL) but designed for Windows and other .NET platforms. It lets you verify the integrity of downloads and determine whether changes have been made to files. It does NOT guarantee the authenticity of a file; for that, use cryptographic signatures such as those produced by PGP or GnuPG. It also lets you create hashes of arbitrary text, which is handy for generating strong “passwords”, although I’m working on a different project that will do a much better job of this particular task. [Looks around shifty-eyed.]
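If you’re curious what that integrity check boils down to in .NET, it’s essentially this (a toy sketch, not WinHasher’s actual code):

using System;
using System.IO;
using System.Security.Cryptography;

class QuickHashCheck
{
    static void Main(string[] args)
    {
        // args[0] = downloaded file, args[1] = the hash published on the download page
        using (SHA1 hasher = SHA1.Create())
        using (FileStream fs = File.OpenRead(args[0]))
        {
            string computed = BitConverter.ToString(hasher.ComputeHash(fs))
                .Replace("-", "").ToLower();
            Console.WriteLine(computed == args[1].ToLower()
                ? "Hashes match; the download is intact."
                : "Hash mismatch; the file is corrupt or has been tampered with.");
        }
    }
}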
Just posting a quick note to let you guys know I’ve bumped good ol’ WinHasher to version 1.5. This is both a bug and feature release, so both of you using it will probably want to upgrade. Here’s a quick list of the changes:
The 2GB bug has been fixed. It turns out the System.IO.FileStream object uses a 64-bit integer for its Length property, meaning this was totally my mistake, not Microsoft’s. The end result here is that WinHasher would crash on files larger than 2GB since it would end up trying to calculate its percent complete on an overflowed negative value (there’s a quick illustration of the overflow after this change list). I’ve updated the code so that the single-file length calculations also use 64-bit integers, and now I can finally validate that Fedora 11 DVD ISO download. Note that there’s still a hard cap of roughly 8EiB (the maximum of a signed 64-bit integer) whether you’re hashing a single file or summing up multiple files together. While it’s possible to bump this up to an unsigned 64-bit integer and go for even more ridiculously large numbers, I seriously doubt anyone is going to be running a SHA-1 hash that large any time soon.
The hash display has also been improved. Previously, WinHasher showed its results using the stock MessageBox object, meaning the hash was displayed in a read-only form that couldn’t be copied and pasted elsewhere to be compared. (It’s much easier to copy and paste two hashes into a text editor, for example, and visually scan the two lines for differences.) Well, I wasn’t the only one to find this annoying. WinHasher user Todd Henry had issues with this too and suggested that I either put the hash result in a text box that could be copied and pasted elsewhere, or add a box where an externally produced hash (say from a Web site) could be pasted into the dialog and have WinHasher compare them. Interestingly enough, I was already planning to make that change when he wrote me, and now it’s there. Once WinHasher is done, it will display a new result dialog with both a copyable hash result field and a new “compare to” field that will take an external hash string and tell you if it matches or not.
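And for the curious, here’s the quick illustration of the 2GB overflow I promised above. This isn’t the real WinHasher code, just a contrived demonstration of why the old 32-bit math produced garbage:

using System;

class OverflowDemo
{
    static void Main()
    {
        long fileLength = 3000000000;      // a 3GB file, stored correctly as a 64-bit long
        int overflowed = (int)fileLength;  // the same value crammed into 32 bits wraps negative

        long bytesHashed = fileLength / 2; // pretend we're halfway through the file
        Console.WriteLine("64-bit math: {0}%", bytesHashed * 100 / fileLength);  // prints 50
        Console.WriteLine("32-bit math: {0}%", bytesHashed * 100 / overflowed);  // prints nonsense
    }
}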
I realized after I updated the files and the site that I forgot to make any changes to the documentation to reflect these updates. Oh, well. I don’t think they’re major enough to sweat over, so I’ll leave those alone for now and make sure they get updated by the next release.
So I was listening to this week’s edition of TWiT, during which Leo Laporte and the usual band of miscreants psychoanalyze Microsoft’s new ad campaign featuring Bill Gates and Jerry Seinfeld. I had not seen the ad yet myself—apparently it debuted during an NFL opening game, and considering that I don’t watch professional sports and the overwhelming majority of my television watching now consists of shows containing magic backpacks and talking monkeys that wear red boots, it hadn’t come to my attention yet—so the discussion naturally piqued my morbid curiosity. So I dug around a little on YouTube and found this. I must admit, it’s as surreal as I was led to believe. I won’t try to mine this thing for hidden meaning like Ryan Block did; the only comment I think I can really make about it is that it tells me absolutely nothing about Microsoft, Windows, or any other product they may have in the pipeline, and after watching it I am no more inclined to pick Microsoft’s offerings over the competition than I was before. I thought that was the point of advertising….
But that’s not the weirdest part. Last night, I dreamed about Bill Gates. Maybe it was exhaustion, maybe it was a prescription-drug fueled haze (I’m currently in the middle of my quarterly bout with bronchitis), but it was not something I was particularly expecting. There’s nothing really interesting to say about the dream, though. In what little I remember, Mr. Gates was there, tying his shoes. He wasn’t necessarily trying on new ones, nor was there any indication that the shoes were noticeably old. They were shiny, brown leather dress shoes, so they could have been either new or well maintained. Mr. Seinfeld was nowhere in sight. The setting was unclear; I can’t say that it was a shoe store, a men’s locker room, or any other recognizable setting. I know only that I was seated on a wooden bench which I believe was painted a dark green and that Bill Gates stood next to me, lifted one leg, and set the foot on the bench, then proceeded to tie his shoe laces. Then he left without saying a word and the dream moved on to wherever it went after that. I remember nothing else about the dream, and to my knowledge Mr. Gates appeared nowhere else within it.
I have no desire to do any research on what kind of Freudian analysis can be drawn from watching a billionaire-CEO-turned-philanthropist from one of the world’s largest and most reviled software companies tying his shoes next to me. I’d be afraid of what I’d find. So I’ll just say it was the prescription cough syrup working its magic and go back to talking to the pink elephant and the green roast beef sandwich on either side of me. It’s a conversation about world politics and how an economy built entirely around edible golf balls will solve the world’s energy crisis. It’s very enlightening. Maybe, somehow, some way, we’ll figure out exactly what makes Windows “delicious” while we’re at it. Drug-induced hysteria is about the only way I can think of in my current semi-lucid state to make an operating system taste delicious. It makes me begin to wonder, though… what would other OSes taste like? Would Mac OS be crunchy? Would Linux be spicy? Would my Treo’s PalmOS be light in calories? I certainly hope so… I am trying to lose weight….
Not long ago, I took advantage of a nifty WordPress plugin to enable XML sitemaps for the blog. For those who’ve never heard of XML sitemaps (I hadn’t for quite a while), they are little XML files in a specific format that give search engines like Google hints on how to index your site. They don’t necessarily improve your search rankings per se, but they help the search engine better decide what to index, when it was last updated, relative priorities of different pages, etc. You then throw a special line into your robots.txt file or directly submit the file to the search engine to let it know the file is available. Once the engine knows about it, it will check it periodically to optimize how the site is indexed.
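That “special line,” by the way, is just a single directive pointing at wherever your sitemap lives (example.com below, obviously, not a real address):

Sitemap: http://www.example.com/sitemap.xml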
The plugin, of course, makes this ridiculously easy for WordPress. However, GPF gets orders of magnitude higher traffic than the blog does, so finding a way to generate sitemaps there would be ideal. I toyed with the idea for a while until I finally sat down, examined the sitemap specification, and figured out how to roll my own code. It now successfully runs via cron each morning and gives a pretty thorough census of what’s available on the GPF server. The problem is that the GPF site is divided into several parts that are largely autonomous and self-contained:
Ignoring the forum, that left me three major sub-projects for creating sitemaps. It’s easy enough to segregate these into separate files and tie them together using a “sitemap index” file, so that wasn’t a problem. The archive would just be a formatted dump of the archive database, deriving approximate update times from the posting date. The bulk of the rest of the site could be done by stepping through the file structure of the site and taking note of every HTML or PHP file and its last modification time (conveniently ignoring certain files and directories that don’t need to be counted, like access-restricted Premium pages). And that leaves the wiki.
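For reference, a sitemap index is just another small XML file that points at the individual sitemap files. The file names below are made up, but the structure comes straight from the sitemap specification:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap_archive.xml</loc>
    <lastmod>2008-08-22</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap_wiki.xml</loc>
    <lastmod>2008-08-22</lastmod>
  </sitemap>
</sitemapindex>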
I managed to come up with a decent wiki sitemap routine that I thought I’d share, just in case someone else might be interested. Of course, it’s not likely to be useful for massive wikis like Wikipedia—sitemaps are restricted to 10MB in size and 50,000 URLs—but something small like the GPF Wiki would be easy to submit and index. It was built using MediaWiki 1.12.0; I am uncertain what database changes may be needed for older or newer versions. Here’s my current process:
I only want to index relevant pages, including category pages. The relevant database table for this is “page”. (How… convenient). Unfortunately, this table also contains things like redirects and images. Each image has its own “page” assigned to it; try clicking on an image in Wikipedia or in the GPF Wiki to see what I mean. The time stamp of the latest revision, however, is stored in the “revision” table, joined to the page table by the latest revision ID number. So a good starting bit of SQL would be:
select p.page_title, r.rev_timestamp
from page p, revision r
where p.page_latest = r.rev_id
  and p.page_is_redirect = 0
  and p.page_title not like '%.gif'
  and p.page_title not like '%.png'
  and p.page_title not like '%.jpg';
Unfortunately, this also returns a few meta pages like the sidebar and editing pages. Before running the query, I define a look-up hash of the titles I want to avoid, and as I loop through the results I simply skip those.
The title, of course, is both the displayed title and the input portion of the URL that uniquely identifies the page. Thus, knowing the base URL (http://www.gpf-comics.com/wiki/), I can easily reconstruct the public URL of any article from the title. As with Wikipedia links, spaces have already been converted to underscores, but the rest of the string needs to be URL encoded. This is easy enough, so we can quickly build the full URL as required by the XML schema.
The time stamp is a little bit tougher. MediaWiki stores time stamps as a 14-digit number in YYYYMMDDHHMMSS format, always in UTC time. In Perl (in which almost all my crons are coded) this is easy enough to break apart and turn into a UNIX time stamp. I then output the date in W3C ISO 8601 format as required by the schema. A sample of a resulting entry would be:
<url>
  <loc>http://www.gpf-comics.com/wiki/Nick</loc>
  <lastmod>2008-08-22T06:00:07Z</lastmod>
  <changefreq>monthly</changefreq>
  <priority>0.3</priority>
</url>
Change frequency and priority are guesses and fudge factors in my case. According to the sitemap specification, priorities are purely relative to other parts of the site. I rated the wiki pages as relatively low since the wiki at GPF is considered a “supporting” section, subordinate to things like the archive. As for change frequency, the sitemap specification includes a number of predefined choices (hourly, daily, weekly, monthly, etc.). Monthly was an off-the-cuff guess; some pages may update more or less frequently, but monthly seemed like a good average. It is entirely possible to rate select pages as higher priority or frequency than others, but I decided to take the easy route and rate everything the same. To apply different values, you just need to pay special attention to the title and assign a non-default value when that title crops up.
Well, I hope someone out there might find this helpful. I’m not sure if it really helps anyone find anything at GPF, but it was a fun little exercise nonetheless.
For both of you out there who care, WinHasher has now been bumped to version 1.3. The changes are very minor, so there’s no need to upgrade unless you find the following two new features useful:
I had originally started adding support for HMAC signed hashes but have abandoned that for now. If there’s anyone out there who might actually find that useful, drop me a line and I’ll revisit the code to see what I might be able to add. Downloads can be found at the first link above.
The new GPF site has been running live for half a month now, and I’m proud to say things have been running incredibly smoothly. That is, at least, from my perspective; I haven’t seen any major glitches, and aside from a few typos in the comic (which are obviously independent of the site code), nobody has written me about any problems. This is especially heartening because the new site was pretty much entirely coded by hand by me, sans a few bits and pieces. (I can’t take credit for the OS, the web server software, the database engine, or the forum. But everything else… yep, that was me.)
Of course, I can’t really take all the credit. I do have to give some serious props to XCache.
For those unfamiliar with PHP, it is one of many server-side, interpreted scripting languages commonly used for dynamic Web site development. The caveat, however, to any interpreted language is that on each request the source script must be read, parsed, compiled, and executed before anything is sent back to the end user’s browser. This is one reason why dynamic sites are and will always be slower than serving purely static HTML files. Static HTML just needs to be read and regurgitated; anything that requires the Web server to actually think takes more time. Add to that the fact that there could be hundreds or even thousands of requests all competing at once for content and it’s a miracle anything gets served at all.
XCache is one of several opcode caching extensions for PHP. Essentially, when the first request for a script is made, the script is parsed and compiled as usual. However, XCache stores the compiled code so subsequent requests can skip the parsing and compilation steps and go directly to executing the code. This significantly increases the speed of execution by eliminating one of the costliest parts of the process (except perhaps database connections). In addition, XCache also includes the ability to cache variables and objects, so commonly repeated and expensive variable generation (such as the cryptographic hashes I use for salting cookie hashes, or database look-ups for common elements like the Premium subscription levels) can be stored in the cache rather than rebuilt on each request.
I was first introduced to XCache by the XCache for WordPress plugin, which was probably mentioned in one of the development feeds built into the WordPress dashboard. I’ve been running this combination here on the blog for a little while with moderate success; I’m still trying to find a good balance of configuration settings to get the best results, but I’ve been happy with the results so far. Without putting much thought into it, I went ahead and installed XCache on the GPF server, hoping that it would help even if I never got a chance to optimize it. Fortunately, it has helped, and now that I’ve optimized the settings it’s exceeded most of my expectations. I’m not sure if there’s something about my code that caches better than WordPress, but GPF has done much better with XCache than the blog has.
Admittedly, I haven’t benchmarked XCache against any of the other opcode cachers out there, so I can’t say how it stacks up against the competition. That said, however, I heartily recommend it to anybody running PHP applications. To get the greatest benefit, you may need to modify some code (or install a plugin if you’re using a prepackaged application) to take advantage of the variable/object caching. But even without modification the opcode caching alone makes for a vast improvement.
I had my first brush with Microsoft Windows Vista this weekend. Like most hard-core geeks who are skeptical of just about anything Microsoft, I’ve read all the hype and negative press and have thus avoided it like the plague. I recently bought a new tablet PC (which just arrived today, woohoo!) and made sure to “downgrade” it to Windows XP. But this weekend, while performing a Good Samaritan deed, I was inadvertently forced to directly interact with Microsoft’s latest and “greatest” OS. And while there’s probably nothing new in this post to anyone who’s used Vista already, I’m sad to report most of what I’d heard and feared is true.
First, a little background. This past week, my sister-in-law’s notebook died. Exactly what happened is still uncertain; we do know that the video subsystem is on the fritz, which likely means that something is up with the motherboard (since the video is on-board). The LCD occasionally looks like a black light lava lamp, if that makes any sense, although I was surprised to have it actually work off and on with any given reboot. Windows XP crashes on boot on the NVIDIA video driver, which might (or might not) be consistent with a video hardware problem. Throw into the mix the fact that the system spontaneously reboots or locks up after an indeterminate period of time, sometimes as long as several hours or as short as ten minutes. I pulled out every trick and tool in my geek arsenal and haven’t been able to completely diagnose the problem, let alone fix it. So now the task has become one of data recovery, and with a creative combination of a Knoppix “live” CD, a USB flash drive, and a USB external hard drive this has gone off without much of a hitch.
Now we introduce the new machine. Like its predecessor, it’s an HP Pavilion “media center” notebook. I put “media center” in quotes because while the old machine actually ran Windows XP Media Center Edition, the new machine runs Vista Home Premium. Other than the OS, it’s obvious both machines are built for one thing: to be a portable home theater system. Both have massive widescreen LCDs, dual huge hard drives, several gigs of RAM, and the latest processors for their time. Needless to say, both machines are meant to be powerful multimedia workhorses and they have the muscle to prove it. Thus, there’s no reason to expect the new machine to be sluggish or slow.
And yet, it occasionally was. HP, like many manufacturers, loads its new machines with tons of useless garbage software. That said, I was surprised to see how little junk was really pre-installed on this thing. So the only thing I can think of that was really bogging it down was Vista itself. I can’t be 100% certain of this as I didn’t take the time to really investigate (most of my time was spent extracting data from the old machine), but there were plenty of times Vista seemed to drag and stutter, sometimes becoming unresponsive for a few seconds.
The culprit, I expect, is the new “Aero” interface. Sure, it looks pretty. I’ll give it that. Compared to XP’s default Crayola-inspired interface (which is one of the first settings I turn off on a new XP machine), it looks slick and modern. But it also seems bulky and bloated. The moment I turned it off and went back to the Windows 95-ish “classic” interface the machine became much more responsive and easier to use. While it was cute watching windows “pop” into existence (something that will probably smell suspiciously like copyright infringement to any Mac OS X user) and the translucent window borders are a nice aesthetic trick, the performance cost is pretty high and not really worth it.
Then there’s the security model. I’d like to applaud Microsoft for finally taking security seriously and making a concerted effort to be responsible with its market dominance by forcing users to be more secure. But boy howdy, is it a bear to work with. Apple has been running attack ads against Vista in their “PC vs. Mac” campaign where the nerdy PC character has to ask permission from a Secret Service-inspired man in black for every single thing he does. I thought that was funny at the time, but I didn’t really realize how true it was. Having grown used to doing things as an administrator in XP, I find it a real shock to be stopped at every other mouse click with a warning that what I’m about to do has serious security implications. It’s not just a pop-up box, either; the entire screen flashes, dimming everything else and forcing you to acknowledge the pop-up before you can continue. Yes, I’m aware of the serious security implications; I’m stepping outside the box and doing advanced things outside what a normal user is likely to do. (For example, moving the contents of the old machine’s “Application Data” and “Local Settings” folders, normally hidden, to their new home.) But do you have to warn me every single blasted time? Really. What’s worse is that this extends beyond some of the obscure, funky guru work I’m currently doing. Simple configuration changes are challenged with the same severity as drastic, devastating, and potentially damaging attacks. Where 95/98 was blatantly promiscuous (or more properly naive) and XP (post SP2) was cautious, Vista is downright paranoid. I half expect it to call in the FBI and the National Guard every time I change my wireless SSID.
Maybe there’s someone out there who can help. If you have experience with Vista and you know how to turn these security pop-ups off, just for my login, at least until I’m done doing arcane geek magic to finish restoring this machine, please let me know. I think I’d be done in a fraction of the time if I didn’t have to babysit these prompts all the time. Even a setting that lets me check a box that says “don’t show this again,” so I only get the warning once per action, would be a big help.
After all that complaining, let me mention one thing I did like about Vista: parental controls. As a parent who is faced with a future where my young son will be a few clicks away from all the porn and identity theft of the Internet, I’ve been looking hard at third-party (as well as home grown) filtering and monitoring solutions. Vista apparently has this built in. Unfortunately, I have no idea how effective it is. My guess is that workarounds to bypass it are now just a Google search away. But still, just like XP’s firewall is more of an afterthought than a real security measure, it’s got to be better than nothing, and it will probably be easier to train my non-tech-savvy sister-in-law in how to use it than to explain about proxies and packet filtering. Depending on how long this machine is in my possession, I might try to experiment and see just how effective these parental controls really are.
Again, there’s probably nothing new here that you haven’t already seen everywhere else, but I thought I’d share my experiences with anyone interested in listening. I’m leaning more and more toward ditching Microsoft completely and going with an all-FLOSS setup, and Vista is helping push me in that direction. Then again, I had huge reservations about XP when it came out, too, so who knows what the future will bring?
I just can’t leave well enough alone. I’ve been mildly annoyed with the “hash in progress” and progress dialogs in WinHasher 1.1. The original idea was to use System.ComponentModel.BackgroundWorker to easily multi-thread very large hashes (say of CD or DVD ISOs or uncompressed video files). This had two benefits: (1) it allows the user to cancel a hash in progress, and (2) it gives us an opportunity to update the GUI while the hashing takes place in the background, meaning we can inform the user of the progress. Unfortunately, I couldn’t find a method right away to determine the progress of an individual hash. System.Security.Cryptography.HashAlgorithm.ComputeHash() by default takes a byte array or file stream and chugs the whole thing at once, spitting out the hash as a result. There’s no way with this method to determine how far along you are.
However, if you look at the guts of ComputeHash(), you’ll find it reads chunks of bytes into a buffer, then calls two methods: TransformBlock() for every chunk but the last, and TransformFinalBlock() to hash the last chunk and finalize the hash. The result can then be obtained from the HashAlgorithm.Hash property. If we bypass the convenience of the single ComputeHash() method call, we can read chunks of bytes from the file ourselves, feed them to the Transform...() methods, and keep track of how many bytes have been read so far. Since we already know how big the file is from the start (System.IO.FileStream.Length), it’s trivial to calculate a percentage complete. Want the progress of a multi-file comparison? Sum the lengths of all the files in the batch, then keep track of the total number of bytes hashed along the way.
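Boiled down to its essence, the new approach looks something like this. It’s heavily simplified (the real code runs inside the BackgroundWorker and reports progress through ReportProgress() rather than the console), but the chunk-and-count idea is the same:

using System;
using System.IO;
using System.Security.Cryptography;

class HashWithProgress
{
    static byte[] HashFile(string path, HashAlgorithm hasher)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            long total = fs.Length;   // 64-bit, so multi-gigabyte files are fine
            long done = 0;
            byte[] buffer = new byte[65536];
            int read;
            while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Feed each chunk to the hash engine without finalizing the hash.
                hasher.TransformBlock(buffer, 0, read, null, 0);
                done += read;
                Console.Write("\rHashing... {0}%", done * 100 / total);
            }
            // An empty final block finishes the computation.
            hasher.TransformFinalBlock(buffer, 0, 0);
            Console.WriteLine();
            return hasher.Hash;
        }
    }

    static void Main(string[] args)
    {
        byte[] hash = HashFile(args[0], SHA1.Create());
        Console.WriteLine(BitConverter.ToString(hash).Replace("-", "").ToLower());
    }
}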
I’ve bumped WinHasher to version 1.2. It should be available on the official site by tomorrow morning.