From scs@eskimo.com Thu Feb 13 10:48:44 EST 1997 Article: 228903 of comp.lang.c Newsgroups: comp.lang.c,comp.software-eng,misc.int-Property,misc.legal.computing Path: newshost.uwo.ca!torn!howland.erols.net!news.mathworks.com!mvb.saic.com!eskimo!scs From: scs@eskimo.com (Steve Summit) Subject: Re: PATNEWS: IBM patents base-B "printf" X-Nntp-Posting-Host: eskimo.com Message-ID: Followup-To: comp.lang.c X-Scs-References: <199702030135.RAA06386@elf.bsdi.com> <32F616E6.218@pact.srf.ac.uk> Sender: news@eskimo.com (News User Id) Organization: schmorganization References: <32f07b4a.0@news.man.szczecin.pl> <32f8684c.0@news.man.szczecin.pl> Date: Thu, 6 Feb 1997 17:46:25 GMT Lines: 122 Xref: newshost.uwo.ca comp.lang.c:228903 comp.software-eng:52084 misc.legal.computing:28275 [followups redirected to comp.lang.c only] In article <32f8684c.0@news.man.szczecin.pl>, slawek@arcadia.tuniv.szczecin.pl (Slawomir Marczynski) writes: > Steve Summit (scs@eskimo.com) wrote: > : The (ever so) novel aspect is that when the base is a power of > : two, a significantly more efficient implementation can be > : achieved, using shifting and masking. A general-purpose > : implementation (i.e. for base 10 or any base), using division and > : modulus operations, is significantly slower. > > But nobody should assume that division and modulus operations will > not be optimized by for example an optimizing C++ compiler! Please > check how modern compilers translate this piece of code: > A = 2 * A; > On most machines/systems it will have been executed like: > SAL AX,1 Oh, believe me, I know. (No unliteracy here.) In fact, I'm usually the one making this argument, when the subject of shifting vs. dividing comes up. However, those optimizations can in general only be performed by the compiler when the shift/divisor is a compile-time constant. In the case of printf, the code for %d, %o, and %x (and, if you make the extension, %b) is common enough and complicated enough that it's reasonable to try to coalesce it. For example, here's a simplified rendition of my own version of printf. (Please pardon the gotos; I do believe they're justified.) negflag = FALSE; digs = "0123456789abcdef"; switch(*fmt) { case 'b': base = 2; u = va_arg(argp, unsigned); goto donum; case 'd': case 'i': base = 10; n = va_arg(argp, int); if(n >= 0) { u = n; } else { u = -n; negflag = TRUE; } goto donum; case 'o': base = 8; u = va_arg(argp, unsigned); goto donum; case 'X': digs = "0123456789ABCDEF"; case 'x': base = 16; u = va_arg(argp, unsigned); donum: p = &buf[MAXBUF - 1]; do { *p-- = digs[u % base]; u /= base; } while(u != 0); if(negflag) *p-- = '-'; /* now print string pointed to by p+1, */ /* with padding, etc. */ Applied to printf, then, the shift-and-mask (instead of divide- and-remainder) optimization for %b, %o, and %x looks like this: switch(*fmt) { case 'b': shift = 1; mask = 1; goto dobnum; case 'o': shift = 3; mask = 7; goto dobnum; case 'x': shift = 4; mask = 0x0f; dobnum: u = va_arg(argp, unsigned); p = &buf[MAXBUF - 1]; do { *p-- = digs[u & mask]; u >>= shift; } while(u != 0); I've finally gone and laid this optimization into my own printf code, and timed it; preliminary indications are that it can run over 40% faster, for a sprintf-heavy test case. (But to reiterate a claim I made in my previous article in this thread, I don't claim to be the first to have thought of this optimization, as applied to printf. Chris Torek sent me the printf code from 4.4BSD-Lite, dated 93/06/02, which uses this optimization, and he indicated that the previous version of the BSD code, dated 91/01/20, "already used shift and mask for base 8 (%o) and 16 (%x) conversions." [It had probably been using it for much longer than that.] Nathan Sidwell has told me, and posted to these newsgroups, that he used the optimization in published form in 1989 or 1990. At any rate, IBM seems to be claiming their patent based on floating-point conversions, not integer. Now, "hex float" is an idea which I've also been thinking about; I have to reread the patent to see if my idea isn't the same as theirs.) > Lets think... An programer use A = 2 * A, compiler optimize it > to A = A + A, IBM has got patent on A = A << 2, compiler optimize > the patented code to A = A + A - what a mess! (BTW 2*2 == 7, see > S. Lem novel about Trul and Clapucius and their computer.) I haven't read any of those stories in a long time, but you're right, they're gems. Steve Summit scs@eskimo.com