Bitwise Evolution

Musings of a Portland-area hacker bent on improving digital lifestyles.

StackOverflow: Endorsing(?) Content Theft From Day One

Joel Spolsky and Jeff Atwood just launched the public beta of Stackoverflow today, with the intent of building a community for high-quality technical questions and answers. I’ve been using the site for about three weeks now, during the closed beta, and I’ve noticed a disturbing trend that was outlined in Joel’s announcement post today:

Want to know an easy way to earn reputation? Find a question somewhere with several good, but incomplete, answers. Steal all the answers and write one long, complete, detailed answer which is better than the incomplete ones.

The site presents an interface where much of the functionality is hidden from new users. You can’t comment on posts, for example, until you’ve earned 50 “rep”. Voting up takes 15 rep, voting down takes 100 rep, and each downvote you place will cost you one rep. You gain rep by posting questions and answers that other users vote up, or accept. The result is an addictive system that, in theory, prevents “Griefing” (the system does NOT prevent griefing, by the way. It is extremely easy to game.)

Because of this, it is tempting to re-post successful content from other sources, and nothing the creators (Atwood and Spolsky) have incorporated into the site, or the recent announcements, has indicated that this is objectionable. After all, good content on Stackoverflow will improve their service, regardless of where it came from, and regardless of whether it is properly credited.

After using stackoverflow for a couple weeks, I think that they have created a useful service, but I also want to call them out for providing an environment that encourages plagiarism. Duplication/copying of content within stackoverflow does not sit well with me, but I’m willing to accept that the content I create for stackoverflow is public domain, and is free to be copied at will. However these posts are not.

Breaking Away From Visio

The ‘proper’ way to do user interface design is hotly contested in the OSS development world, and the discussions usually boil down to three suggestions:

  1. “Just write it – it’s not that hard”
  2. “Use [glade|qt designer|netbeans|…] – all the widgets are there”
  3. “Just use pencil/pen/whiteboard/etc – it’s faster”

I don’t agree with any of these – (1) is completely unreasonable. Some people may be able to hack out a UI in their favorite language quickly, but when you suddenly need to move half the UI into a new dialog, or out of a dialog and into the main window, and change the tabs into a check list, with sample data, you’re screwed. Once you finish manhandling your code to account for that change, you’ll need to add a component that is shaped like a septahedron with 7 distinct clickable areas and a tooltip that includes the latest stock quotes, and that tooltip needs to be scrollable. The widget set places unreasonable restrictions on the design phase of your project.

Option (2) suffers from the same issues as (1) – you’re constrained by the widgets available, although this is less of an issue because it’s generally easier to add images, and the images can look like the widgets you’re missing. However, since GUI development tools are, well, for development, they require you to do lots of irrelevant (at this stage) tasks, like specifying how objects will behave when they’re resized, and dealing with layout managers.

(3) is the closest fit for my needs, and I do do lots of paper/whiteboard prototyping, but eventually, I need to show something that looks real, and sketches don’t cut it. There simply isn’t enough resolution there to convey everything that needs to be included in the mock-ups I create.

Hm.. perhaps I should go into what I do need, since my needs may be pretty esoteric. If you’ve made it this far, you’re probably seething with anger, or you’ve got some idea of where I’m coming from. I’m frequently in need of electronic versions of UI prototypes for remote collaboration, “wizard of oz” testing, or for inclusion in presentations and reports. These mock-ups need to look “real” or there is a substantial risk of biasing any experiments, and there is an expectation of polish that can’t be reached with hand-drawn interfaces. Since a lot of what I create is to solve novel problems in (at times) esoteric domains, we often need to use a mix of existing and novel widgets.

Generally, we use Visio to create these interfaces. It offers a good balance between vector drawing capabilities and shape templates for common UI widgets / forms / etc. You are also able to import images, which is fairly critical when updating or adding to the UI of an existing tool. (It’s easy to take a screenshot, clear out the details with the Gimp, and import as a background layer in Visio.)

Unfortunately, I’ve been unable to find any OSS tools that can fill this niche as well as visio. There are a few, as recent posts from slashdot and the old Joel on Software forums show:

  • DENIM: Lets you sketch out interfaces with a tablet / mouse and create navigable web sites from those sketches. Lacks in the “polish” area.
  • Pencil: Firefox Plugin. Performance has been poor, in my experience. There are very few widgets (currently) available, and no image import capabilities (this is a huge flaw, IMHO). Pencil could turn into something great, though.
  • DIA: Last release was in March, 2007, but the svn repo does show some activity. DIA lets you create things like network diagrams, UML, and flow charts, much like Visio, however, there are no UI stencils. Instructions for creating new stencils (‘shapes’) exist, but the svg support for shapes is very limited (no gradients, no rounded rectangles, etc..) and the documentation is even worse.
  • Kivio: Much like dia, with essentially the same failings.
  • QT Designer | Glade | etc.: see above comments about GUI development tools.
  • Inkscape: Nominally a vector drawing tool, much like Adobe Illustrator, Inkscape has an active community, good documentation, and it is quite stable. Unfortunately, it is not possible to customize the palettes / shapes available, and there is not much community support to make it a good UI design tool (aside from what can be done with any vector drawing app of this quality).
  • Yahoo! UI Stencils (YUI): Not really a tool, but rather a collection of svg images of common interface widgets.

None of these, on their own, do the job. However, with nothing else looking bright, I’ve been digging into Inkscape more over the last few days, and I think I’ve figured out a workflow that will do.

First off, the YUI stencils are critical – but they are not in a format that can be easily imported and used as “widgets”. Ideally, Inkscape would let me define custom shapes, complete with constraints on the sub-components of those shapes to influence resize and translation behaviors, but that isn’t yet available (to my knowledge). You can get around this, somewhat, by using the open dialog as a palette of sorts:

“If you have a number of small SVG files whose contents you often reuse in other documents, you can conveniently use the Open dialog as a palette. Add the directory with your SVG sources into the bookmarks list so you can open it quickly. Then browse that directory looking at the previews. Once you found the file you need, simply drag it to the canvas and it will be imported into your current document.” (From the *Tips and Tricks* tutorial in Inkscape)

This would work reasonably well, if the open dialog were not modal! (Ranting about modal dialogs is another post, or two, at least.) Thankfully, you can drag from the dialog into an inkscape instance even if they are running as different processes :). Therefore, you can start up two inkscape processes (NOT via the “new document” option on the toolbar or file menu – you have to actually start two instances separately, or the dialog’s modality will still interfere with your work). Once you have the two processes going, and two inkscape windows, open the open dialog in one of them, go to the directory with your widgets, minimize the (now useless) inkscape window you opened the dialog from, and rock on with the YUI stencils & whatever other tools you need to hack out your UI in the other inkscape instance.

There are a couple of things to keep in mind:

  • Inkscape supports layers, so you can create stub data in a separate layer from the UI structure, and set the background content in another layer, etc.. so you don’t have to worry (as much) about grabbing the wrong thing and moving it out of place.
  • The drag-and-drop action from the open dialog will include everything in the dragged svg file – so the YUI stencils (or any custom shapes you make) need to be broken out into separate files. (I’ve done this for some of the components, and you can download those files here: (Broken out YUI stencils). They are released under a Creative Commons Attribution 2.5 License.)

Pencil (or one of the other options) may work better for you – many people have complained that their clients think an app is nearly finished because the UI looks “real”, and there are numerous ways to address that. (eg: NapkinLAF for Swing apps.) I haven’t had this problem, and something like NapkinLAF doesn’t address the problems I have, which are all related to pre-coding UI design.

Wrestling Python

With the launch of the StackOverflow beta I posed a question about python static analysis tools, as I have been playing with python and django recently for some side projects. The responses at Stack Overflow quickly pointed to PyChecker, PyFlakes and PyLint.

Overall, it was a disappointing experience. My experiences are outlined below, and they (more or less) reflect this more extensive review by Doug Hellmann.

Here are my first impressions of pyflakes, pychecker and pylint:

  • pychecker: It crashes frequently; most of the runs I tried resulted in errors that originated in the pychecker code (eg: AttributeError or IndexError: list index out of range were the most common). For some reason I had to set the DJANGO_SETTINGS_MODULE environment variable before it would even run on any of the app code, and the documentation is very sparse.

  • pyflakes: ‘pyflakes --help’ throws a TypeError – erm… Documentation is also very sparse, and pyflakes is very forgiving (as far as I can tell, it only reports compile errors, warnings, redefinitions, and some concerns about imports–such as unused and wildcards). pyflakes also seems to repeat itself:

    eventlist/views.py:4: 'Http404' imported but unused
    eventlist/views.py:4: 'Http404' imported but unused
    eventlist/views.py:5: 'from eventlist.models import *' used; unable to detect undefined names
    eventlist/views.py:59: 'authenticate' imported but unused
    eventlist/views.py:61: redefinition of unused 'login' from line 59
    eventlist/views.py:5: 'from eventlist.models import *' used; unable to detect undefined names
    eventlist/views.py:4: 'Http404' imported but unused
  • pylint: This seems to be the most capable of the tools suggested. It has the best documentation: LogiLab provides a tutorial, pylint has a help screen, and there is a (broken) link to a user manual, which would be extremely helpful. There are some issues with applying pylint to django, since pylint doesn’t know about the django classes (such as models.Model). This means that a fair number of spurious errors (of a kind that would otherwise be valuable) are generated about missing class fields. eg:

    E:105: get_events_by_tag: Class 'Tag' has no 'objects' member

Parsing these out automatically will be very difficult without some additional knowledge of the classes in use. I’m not sure adding that is feasible, but it does seem likely that pylint is capable of dealing with this in the “right” way. (I probably just need to point it to the django source, but there are no command line params that look likely, and, as mentioned earlier, the user manual is inaccessible.)

For the moment, I’m still looking into pylint – pychecker and pyflakes need better documentation and they need to become more robust.

Traveling to Patras

Coffee and clouds in Seattle.

A groggy morning in Seattle started with the typical regional sunshine forcing its presence through heavy cloud cover–the first overcast day in nearly a week of clear, scorching weather.

The hills across the Gulf of Patras (looking North)

Seattle to Newark, hustle off the plane, bad coffee, hustle to the next gate, and then encamp for the next 9 hours of travel across the Atlantic. This leg was on a relatively empty 767, but I still can’t really sleep on planes. I have become an overnight advocate for noise-canceling headphones though–they make the difference between muffling a sound and turning it off. I found the Athens airport to be considerably less hyper than most US airports (with the exception of Syracuse, which is not much of a surprise), although I think it is universal that no one is particularly happy when at an airport, and that feeling seems to be contagious. It’s unfortunate that it is so difficult to visit a place without first visiting an airport there.

The morning view from the balcony

A few hours later I hopped on a tour bus to take everyone from the recent batch of flights off to our hotels in Patras–a wild 4-5 hour ride along the shoulders of “one of the worst roads in Greece” as someone here put it. The careening along coastal roads was punctuated by hairpin turns in seaside villages with roads small enough to make any driver feel claustrophobic, and we were riding in a bus that would seat 30-40, with luggage. It was much like watching a game of Fifteen Puzzle, with a key twist: every tile (car) was self-aware and initially greedy.

We did eventually make it to the Hotel Tzaki almost exactly 24 hours after leaving Seattle.

Cracking Down on Application Clutter (or: My ${HOME} Is My Castle!)

There was once a time when your home directory was treated as a nearly sacred place, a safe haven where you had near complete control. This trust was only breached for very special reasons: user specific settings and background storage for applications could go in “dot-files”–the hidden files or directories that begin with a “.” and therefore don’t show up in normal directory listings.

Unfortunately, things began to change. I don’t know what kicked it off, but soon there was a Desktop (or desktop) folder. It was glaring–many XFree86 window managers don’t even have the concept of a desktop, but the default environments were (and still are) often set to Desktop Managers. Web browsers took after the DMs, and soon we all had these glaring “Desktop” directories hanging out whether we wanted them or not. I’ve managed to tolerate this infraction for years, aside from the occasional frustration (eg: Eclipse and NetBeans, with their requests for ~/workspace and ~/NetBeansProjects directories).

However, today things changed.

$ ls ~
bin/           PDF/
Desktop/       Pictures/
development/   Public/
documents/     src/
Documents/     shared/
downloads/     Templates/
Mail/          Videos/
Music/         virtualMachines/
myapps/        workspace/

Documents? (Ok, I can sort of understand that one.) Music? Pictures? Templates, PDF, Public, and Videos?! Did I suddenly become a master of multimedia? Keep in mind here, I’m a java hacker on a Linux box–this isn’t exactly a fine-tuned rendering/desktop publishing platform. And of course, every one of those directories is empty. Thankfully, I checked before deleting documents vs. Documents (I’ve been bitten there before–on a mac due to case conflation–but that’s another story).

Why would I want a directory called PDF? I can understand (possibly) wanting to tag files with “PDF”, but as part of a single-dimensional sorting criteria? (Hey, let’s store all my .h files in ~/H/ and all my .cpp files in ~/CPP/! It’ll be great!)

Needless to say, I’ve removed the offending directories, and this time I’m ready:

$ kernel-filesystem-monitor-daemon-cat -v watch ${HOME} | perl -ne '{
  if( /CREATE/ ) {          # only report create events
    s|.*URL:\./||g;
    if ( !/^\./ ) {         # don't report new dot-files
       print "$_ created @ ";
       print `date`;
    }
  }
}' > ~/whenCrapWentDown.txt

(newlines and comments introduced to improve clarity – if you’re pasting this into a shell, you’ll need to remove the newlines and comments first.)

KFSMD (kernel-filesystem-monitor-daemon) is an app that does exactly what its 32-character name says. Whenever a filesystem change occurs, it knows about it. The -cat part just tells it to print to stdout, and the hunk of perl does some minor processing, and introduces time stamps.

I’m actually running this in a sticky terminal that’s pinned to my E17 desktop, so if/when something starts building an empire in my home directory, I’ll be able to compare with the apps that are running, and hopefully track it down. (It would be nice to collect the PID of the process that actually issued the system call to touch the file system, but this is good enough for now.)

(Screenshot: fsWatcher.png)

Now we wait…

(This article got me going with KFSMD.)

Creating Wizards in Java

A recent project at work required building a multi-step dialog to manage the interface between a user and an expert system (and some fairly advanced NLP to boot). On the surface this looked like a fairly standard Wizard problem – design a bunch of screens with questions, and then collect the answers as the user proceeded through the dialogs. However, the Wizard APIs I found were either not very mature or (in the case of the Java.net Wizards) made it very difficult to create complex branching behaviors, and those branches were extremely resistant to change. Both are essentially show-stoppers when working with prototypes that frequently need modification.

In the end, I spent a weekend and a couple evenings building a new Wizard API for Java, called CJWizard. The library is released under the Apache V.2 license, so it should work for just about anything you want to use it for. I would like to know if you’re using it, and what you’re using it for, just to sate my own curiosity :). The project is hosted on code.google.com, so please submit issues, and feel free to contribute to the project.

CJWizard provides the structure needed to quickly create simple dialogs by implementing an abstract class (WizardPage) for each page of the dialog, and adding them to a PageFactory that can generate pages on-demand, as they are required. This puts the programmer in full control of how the wizard proceeds. The CJWizard architecture also makes it easy to add a wizard to an existing application (either via an additional JDialog, or embedding in some other component), and/or insert custom wrapper widgets around the dialog pages–meaning that you can quickly add customized navigational controls aside from the standard Previous/Next/Finish/Cancel buttons.

Some aspects were taken from the Java.Net wizard API, such as auto-detecting named components, and automatically collecting the values from them, but CJWizard takes a much simpler approach (and in some ways, a less powerful one – CJWizard does not listen to every key event, only collecting values when the user navigates away from a WizardPage). In most cases, you only need to name widgets prior to adding them to the WizardPage, and their values will be collected in a settings map automatically.

CJWizard was meant to provide a flexible way to generate professional-looking multi-step dialogs very quickly.

Day to Day Memoization

Memoization (not memorization) is the process of remembering the results of a computation for use later. (I think of it as “making a memo” to look back on later.) Memoization is the core to any dynamic programming implementation, and allows many simple algorithms to run in linear or polynomial time when they would otherwise take an exponential number of operations to complete. This is most obvious in the typical recursive Fibonacci example. Consider the code:

[cc lang="java"]
public class Fib{
   public static void main(String[] args){
      System.out.println("done: fib of "+args[0]+"="+
         fib(Integer.parseInt(args[0])));
   }

   public static int fib(int n){
      int rval = 1;
      if (n >= 2){
         rval = fib(n - 1) + fib(n - 2);
      }
      System.out.println("fib("+n+") = "+rval);
      return rval;
   }
}
[/cc]

This is a straight-forward recursive implementation of fib. When run with n=4, we see this:

[cc lang="bash"]
$ javac Fib.java && java Fib 4
fib(1) = 1
fib(0) = 1
fib(2) = 2
fib(1) = 1
fib(3) = 3
fib(1) = 1
fib(0) = 1
fib(2) = 2
fib(4) = 5
done: fib of 4=5
[/cc]

9 invocations of fib(n), but only 5 unique invocations. Let’s memoize the results, and try this again:

[cc lang="bash"]
$ javac Fib.java && java Fib 4
fib(1) = 1
fib(0) = 1
fib(2) = 2
fib(3) = 3
fib(4) = 5
done: fib of 4=5
[/cc]

Much better – 5 invocations, 5 unique sets of parameters.

Here’s the source with memoization:

[cc lang="java"]
import java.util.HashMap;
import java.util.Map;

public class Fib{
   static Map<Integer, Integer> memos = new HashMap<Integer, Integer>(); // new

   public static void main(String[] args){
      System.out.println("done: fib of "+args[0]+"="+
         fib(Integer.parseInt(args[0])));
   }

   public static int fib(int n){
      if (memos.containsKey(n)) // new
         return memos.get(n);   // new

      int rval = 1;
      if (n >= 2) {
         rval = fib(n - 1) + fib(n - 2);
      }
      System.out.println("fib("+n+") = "+rval);
      memos.put(n, rval);       // new
      return rval;
   }
}
[/cc]

Notice that we only needed to add four new lines of code (plus two imports) in order to memoize the results. When fib(n) is called, it simply checks to see if it has previously been called with n, and if so, that result is used again. If the parameter has never been seen before, the method continues as normal, storing the computed result before returning. Memoization turns this naive (and exponential) implementation of fib(n) into an efficient (linear) operation.

Memoization in the real world

So, (un?)fortunately we don’t spend all day implementing cool new ways of computing ever increasing entries of the Fibonacci sequence – how can memoization be put to use? After all, many algorithms are already implemented in some fairly optimal fashion by the language APIs, and you’d be a fool not to use those implementations. What opportunity will you have to memoize functions?

It turns out that you can memoize anything, as long as the function is pure with respect to the memos (meaning: the function doesn’t depend on anything that is not used to key the hash of memos). If the function is not pure, you can still use memoization, but the memo hash must key on all the state and parameters that can affect the result of the function. On the other hand, if f depends on some state that changes very rarely, then it may make more sense to simply discard all the stored memos each time that aspect of state is altered.
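A minimal sketch of the discard-on-state-change idea, using a made-up example of my own (a formatter whose output depends on one piece of near-constant state), not anything from a particular library:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: a price formatter whose output depends on one piece
// of near-constant state (the currency symbol). The memos are only valid for
// the current symbol, so they are discarded whenever that state changes.
public class PriceFormatter {
    private String symbol = "$";                  // near-constant state
    private final Map<Integer, String> memos = new HashMap<Integer, String>();

    public String format(int cents) {
        String cached = memos.get(cents);
        if (cached != null) return cached;        // memo hit: skip the work
        String result = symbol + (cents / 100) + "." + String.format("%02d", cents % 100);
        memos.put(cents, result);
        return result;
    }

    public void setSymbol(String symbol) {
        this.symbol = symbol;
        memos.clear();                            // state changed: every memo is stale
    }
}
```

The key point is the `memos.clear()` in the setter – as long as every path that mutates the relevant state also discards the memos, the function stays “pure enough” to memoize safely.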

Memoization is extremely handy when you have very common operations that are fairly expensive. I recently needed to optimize some code that compares strings based on the case-insensitive stems of the words, with stopwords removed. So the strings “he wanted an apple” and “he wants apples” should be equal. (“an” is a stopword, and ignored)

This meant doing many, many calls to a string stemmer, each of which is a fairly expensive operation. Fortunately, hashing strings is extremely cheap (on the order of 1/4th the time it took to stem a string of the same length), and I had plenty of memory to store the parameters and the results in a Map. Adding memos to the two primary time-hoggers (the stemmer and a tokenizer) cut the execution time of the application down from 2 hours to just over 7 minutes.

Summary

You can memoize any function that only depends on its parameters and constant state (or near-constant state – just don’t forget to discard your memos when the state changes!). If the function is invoked multiple times with the same parameters, you will probably see a performance improvement.

If you need to memoize a function with multiple arguments, then you just need to nest Maps, or create a unique key by combining the parameters in some way.
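Here is a sketch of the composite-key approach, using an illustrative two-argument function of my own (binomial coefficients) rather than anything from the text:

```java
import java.util.HashMap;
import java.util.Map;

// Memoizing a two-argument function by combining both parameters into a
// single map key. (The function itself is just an illustration.)
public class Choose {
    private static final Map<String, Long> memos = new HashMap<String, Long>();

    // n choose k, via the recurrence C(n, k) = C(n-1, k-1) + C(n-1, k)
    public static long choose(int n, int k) {
        if (k == 0 || k == n) return 1;
        String key = n + "," + k;        // combine both parameters into one key
        Long cached = memos.get(key);
        if (cached != null) return cached;
        long rval = choose(n - 1, k - 1) + choose(n - 1, k);
        memos.put(key, rval);
        return rval;
    }
}
```

Nested Maps (a `Map<Integer, Map<Integer, Long>>`) work too; the composite key just keeps the bookkeeping in one place. Either way, the separator matters – `n + "," + k` keeps (12, 3) distinct from (1, 23).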

Memoization is an extremely easy way to improve performance under certain circumstances, particularly if you have a solid grasp on when state changes outside of your methods / functions, or program in a functional style. It can be memory intensive, however. If the results of your functions are large, or maintain references to large objects, then memoization may penalize performance if you run out of memory and have to make use of swap space.

Creating a Secure Webauth System: Part 1 – HMAC

This is the first in an n-part series about web authentication for a system where user identification and attribution is important, but content protection is not. This entry assumes that a secure method has been used to negotiate a shared secret – as the result of username / password authentication over https, for example.

Obviously the user login / account registration portion of a web auth system will require some secure connection, but once that authentication is completed we’d like to make use of a more efficient open protocol. (eg: http vs. https). There are many reasons for this: better performance, client-side caching, etc.. I’m not going into those details here. Neither will I address the initial authentication step other than to say that part of a successful login is the negotiation of a shared secret other than the user’s password. Ideally this is a 64-byte (or larger) id with a high probability of uniqueness. A GUID, essentially. (It is critical that the secret used is NOT the user’s password!) The actual secured login and secret negotiation will be addressed in another entry. At least, that’s the plan :).

Since our primary goal with this system is to ensure that people are who they say they are, and we’ve punted on the initial authentication (for now), the only place left for an attack is for someone to spoof a user who has already logged in. With no additional work, our login system would be useless – someone could simply skip the entire authentication process and issue an RPC with instructions to do evil things as Alice’s user without needing to know Alice’s password. To prevent this, we need to ensure that the same user who authenticated initially is the user who issued the unsecured RPC. This is where the shared secret comes into play. Only Alice and the server know her shared secret, so if the secret is passed along as a parameter of the RPC, then that is a strong indication that Alice is who she says she is.

But wait! We can’t just pass the secret as an RPC parameter, because these communications aren’t secure. Charlie could lurk on the ‘net, waiting for an RPC from Alice to the server, sniff out the secret, and then proceed to impersonate Alice. We could encrypt the secret, but then we just have a different secret – Charlie doesn’t need to know the unencrypted secret if the encrypted one works just as well. Alice and the server also need an agreed upon way to change the secret so it is different for each RPC, and this must be done in a way that Charlie can’t take the changed secret and either (1) get the initial secret out, or (2) generate the next changed secret.

HMAC: keyed-Hash Message Authentication Code

HMAC is a method of ensuring that a message (an RPC in our case) was generated by someone with access to a shared secret. HMAC makes use of a one-way hash function (like MD5 or SHA-1) to hash the secret along with a message. This generates a short digest of 16-20 bytes that acts as a fingerprint of the message+secret combination. When the digest is sent along with the message, the receiver (our server) can re-generate the digest with the same HMAC calculation and compare the locally generated digest with the digest that came along with the message. Remember: the server has the secret too, so it has enough information to confirm the digest.
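In Java, the digest computation itself is only a few lines with the JDK’s built-in javax.crypto.Mac support. A minimal sketch (the class and method names here are mine, and the hex encoding is just one convenient way to ship the digest over the wire):

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Compute an HMAC-SHA1 digest of (secret, message) and hex-encode it.
public class HmacDigest {
    public static String hexDigest(byte[] secret, byte[] message) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secret, "HmacSHA1"));
        byte[] digest = mac.doFinal(message);     // 20 bytes for SHA-1

        StringBuilder sb = new StringBuilder();   // hex-encode for transport
        for (byte b : digest) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}
```

The resulting string is what travels alongside the message; the server runs the same computation on its copy of the secret and compares.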

So, back to our problem. Alice can now use the shared secret to create a digest of every RPC, and send that along with the RPC as a parameter. The server can then recompute the digest from the RPC and the secret, compare it with the one Alice sent, and verify that the RPC actually originated with Alice, right?

Not quite yet…. there are still a couple holes in our plan.

Charlie could still sit back and snoop on Alice’s traffic and save an entire RPC, complete with digest, and reissue that RPC later. This is better than letting Charlie do whatever he wants, but there are still some things that could be quite dangerous. Say Alice accidentally deletes something, and undoes the deletion. Charlie could re-issue the deletion and Alice would lose data. The server needs to know not to accept the same request twice (but what if Alice wants to do something twice, you ask? Well, we have to make Alice’s second request a little bit different from the first one, which we can do!).

What if we create a digest of some sequence identifier and pass the sequence ID along with the RPC? (Strictly speaking, the digest should cover the RPC contents as well as the sequence id – otherwise a valid digest+id pair could be attached to a tampered RPC – but digesting the changing id is the part that stops replays, since each digest+id pair is only good once.) By incrementing the sequence id and recalculating a digest of it and the secret, we can keep from issuing the same request more than once, and the server will know to ignore duplicates.
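The scheme above can be sketched in Java (again with javax.crypto.Mac; the names and the replay-tracking details are my own assumptions, not a prescribed protocol):

```java
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// A sketch of the sequence-id scheme. The client sends (sequenceId, digest)
// with each RPC; the server recomputes the digest from its copy of the
// secret and rejects both forgeries and anything it has already seen.
public class RpcVerifier {
    private final byte[] secret;
    private long lastSeenId = -1;   // anything at or below this is a replay

    public RpcVerifier(byte[] secret) { this.secret = secret; }

    // What the client does: digest the sequence id with the shared secret.
    public static byte[] sign(byte[] secret, long sequenceId) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secret, "HmacSHA1"));
        return mac.doFinal(Long.toString(sequenceId).getBytes());
    }

    // What the server does for each incoming RPC.
    public boolean accept(long sequenceId, byte[] claimedDigest) throws Exception {
        if (sequenceId <= lastSeenId) return false;   // duplicate / replay
        byte[] expected = sign(secret, sequenceId);
        if (!MessageDigest.isEqual(expected, claimedDigest)) return false; // forgery
        lastSeenId = sequenceId;
        return true;
    }
}
```

A real server would track sequence ids per session rather than in a single field, but the shape of the check is the same: recompute, compare, and refuse anything already seen.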

So, this is where we’re at:

  • Charlie can’t snoop the secret, since it is never sent – only a digest of the secret hashed with a changing message (the sequence id)
  • Charlie can’t re-issue a “recorded” RPC invocation, because the digest can’t be reversed and Charlie can’t create a valid (digest, sequence) pair without the secret.
  • Charlie can’t change an RPC, again because of the trouble with creating a (digest, sequence) pair.

Charlie’s only recourse is to try to find a secret which generates the same digests as the secret that Alice is using. This is theoretically possible, since Charlie could probably figure out the hashing algorithm used, and run a brute-force attack, hoping to luck out and find the secret quickly. The possibility of this happening is extremely low, however. Furthermore, each session will use a new secret, so Charlie will only have one session’s worth of time to crack each secret. Even creating a rainbow table will fail if the secrets aren’t of trivial length. (A 64-bit secret would already be too large, and we’re using secrets 8 times that size.)

Technical details and further reading

When implementing an approach like this, make sure to guard the secret. It would be easy to accidentally store the secret on the web client as a plain cookie which will then be transmitted in the clear with each RPC, and therefore defeat the purpose. Use a secure cookie, or some other storage method to prevent this.

The HMAC RFC describes the algorithm in detail (and it’s a fairly short, easy to read RFC.) and the Wikipedia page gives a nice description too: HMAC on WikiPedia

Things I Need

There are many small apps that I wish I had, here’s a short list of the ones that come to mind at the moment:

A process monitor that shows the top consumer.

I often tax my system(s) to the max, and therefore run out of cycles frequently. While this is sometimes the result of batch computations that I’ve planned in advance, it is pretty common that I’ll just be working away and all of a sudden things shudder to a halt. When this happens I want to know two things: 1. Is it processor-related, or is it memory-related, and; 2. What application is responsible? Processor / memory monitors are a dime a dozen, but they are typically very small (showing only the usage, like gkrellm) or very large (showing all the applications in the top 20 or so, like top). I can’t stand having top visible all the time, and it takes too long to get to a terminal and start up a monitor. By definition my system is not very responsive, and I never see what’s causing the slowdown.

I need a small process and/or memory monitor that shows the top-using application in a tooltip, or optionally in an automatic pop-up when the usage hits a certain level.

Universal acronym definitions.

Highlight an acronym, hit a keystroke, and see a list of the most common expansions of that acronym based on frequency of use.

A calendar dock-app where the date on the dock icon is actually accurate.

Yeah, I actually want to look at my system bar-thingy and see what day it is, not some random number between 1 and 31 that the icon developer thought represented calendars.

Hovering over the icon shows the full time/date, which is configurable from an entry on the icon’s context menu. I don’t care what happens when I click on the icon, as long as I can make it do something arbitrary ;).

The ability to refactor and generate source code from the command line.

MetaJava (http://wiki.ciscavate.org/index.php/MetaJava) could resolve this issue for one language, but I’d hate to stop there. A tool like this would mean amazing things for small-time development environments and text-editor lovers (Emacs and vi would easily eclipse some other IDEs, IMHO ;).

The idea is that you could easily create mini applications that read specifications in some simple format and produce boilerplate for your target language, and/or move classes, rename variables/methods/packages, etc., without dedicating half your memory to an IDE that will then want to write 500 MB to swap, since all my memory is taken up by an application I have to run because….

I don’t have a web browser that doesn’t suck.

“We hold these truths to be self-evident.”

…and a web development environment to go with it.

That’s a start. I’ll add more as they occur to me.

(Not (Fill-paragraph))

I use emacs as much as possible, today being no exception. Currently I’m doing a fair bit of writing at work, and unfortunately that means Word (or OpenOffice at best, depending on which OS I’m in). Neither program supports much in the way of emacs compatibility modes, so if I’m generating new content (as opposed to editing an existing doc) I tend to write in Emacs and paste into Word when I’m finished. This works pretty well, considering.

There is one very annoying issue, though: in emacs I use auto-fill-mode to keep the content on screen as I type. The problem is that auto-fill-mode breaks lines with literal newline characters, while the word wrapping in Word/OpenOffice just wraps the content without adding any characters. As a result, each line ends up as its own paragraph when I paste content from emacs into Word. The solution, of course, is to extend emacs with a simple function to undo the auto-fill.

Merging lines

The first problem, as I saw it, was to find a function that would merge two adjacent lines, leaving them separated by a single space. Unfortunately, such a thing doesn’t seem to exist. No problem: we go to the end of the current line (end-of-line), search backward for the first non-whitespace character ([^ \t]), erase the rest of the line, including the newline (kill-line), and insert a space ((insert " ")).

    (defun mergelines (&optional backward)
      "Merges the following line with this line, or merges this line
      with the previous line if a prefix argument is provided.
      Removes any whitespace between lines, replacing it with a
      single space."
      (interactive "P")
      (if backward
          (forward-line -1))   ; `previous-line' is meant for interactive use
      (end-of-line)
      (re-search-backward "[^ \t]")
      (forward-char)
      ;; If there was trailing whitespace, the first `kill-line' only
      ;; removes it; a second one is needed to kill the newline itself.
      (unless (eolp) (kill-line))
      (kill-line)
      ;; Drop the next line's leading whitespace before joining.
      (delete-horizontal-space)
      (insert " "))

To make it more useful, I added a parameter that determines whether the line below or the line above should be merged. This made the rest of the unfill function much easier to write.

Un-filling a region

Now that we can merge lines, let’s address the problem of unfilling a bunch of lines. Since I know there is already a mark-paragraph function, let’s just deal with arbitrary regions for now.

    (defun unfill-region (rstart rend)
      "Merge all the lines in the region into a single line."
      (interactive "r")
      ;; Start from the end of the region:
      (goto-char rend)
      ;; If the region ends on the first char of a line, move up a line.
      ;; This makes it easier to select a paragraph and apply the function.
      (if (= (point) (line-beginning-position))
          (forward-line -1))
      ;; Loop while the point isn't on the starting line:
      (while (not (= (line-number-at-pos (point))
                     (line-number-at-pos rstart)))
        ;; Merge with the previous line.
        (mergelines t)))

I’ve tried to comment the code well, so it should be relatively straightforward, but here’s an overview of the algorithm anyway:

  1. (interactive "r") just means that the current region’s start and end locations are stored in the parameters rstart and rend.
  2. We need to merge from the bottom up; merging from the top down would mean keeping track of the lines already merged, and things generally become more complex (we might end up merging too many lines if we lose count). Because of this, we first move to the end of the region.
  3. Since you (well, I) generally select from the first column, and move one line past the last line I need (try it if you don’t understand what I mean), I needed a special case to keep from merging the empty line between paragraphs.
  4. Now, we merge each line with the line above, which moves the point up a line too. When the point is on the same line as the start of the region, we stop.

I haven’t merged it with mark-paragraph yet, but it would be trivial to do so. More importantly, I want to make it skip blank lines, so that it’s possible to mark an entire document and call unfill-region (and, from there, write unfill-buffer). As it is now, if you do that the entire document ends up on one line, which is not usually ideal :)
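As a sketch of that direction, here’s one way unfill-buffer might look: walk the buffer paragraph by paragraph, leaving the blank separator lines alone and handing each stretch of non-blank lines to unfill-region. This builds on the function above and is lightly tested at best:

    (defun unfill-buffer ()
      "Un-fill every paragraph in the buffer, leaving the blank
      lines between paragraphs intact.  A sketch built on top of
      `unfill-region'."
      (interactive)
      (goto-char (point-min))
      (while (not (eobp))
        (if (looking-at "[ \t]*$")
            ;; Blank separator line: leave it and move on.
            (forward-line 1)
          ;; Otherwise, un-fill from here to the next blank line (or
          ;; the end of the buffer).
          (let ((start (point)))
            (while (not (or (eobp) (looking-at "[ \t]*$")))
              (forward-line 1))
            (unfill-region start (point))
            (forward-line 1)))))

The looking-at checks only work because point is always at the beginning of a line when they run; treat whitespace-only lines as separators at your own discretion.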