Kuro5hin.org: technology and culture, from the trenches

Certainly Not Logic

By czth in Op-Ed
Thu Aug 14, 2003 at 02:52:47 AM EST
Tags: Technology

"Logic!" said the Professor half to himself. "Why don't they teach logic at these schools?"

"Bless me, what do they teach them at these schools?"

- The Professor, The Lion, the Witch, and the Wardrobe, C. S. Lewis, 1950.

Wherein we look at various code samples and wonder how these people learned to code, and determine what lessons we can learn from their ineptitude.


Many of the code excerpts shown here will be seen to have a similar purpose; that is because they were written to fulfil this programming task as a "pre-screener" for applying for a development position. The names of the guilty have been withheld and some identifying characteristics of the code changed.

All code shown is real code written by real people - and thus more valuable for demonstration (and entertainment) than abstract exhortation.

Documentation

Comments are an important aspect of code; they're the top-level road map that helps future maintainers map what the code does to where it does it. Code style and variable naming are also important parts of documentation.

#####################################################################
##                                                                 ##
##                    Object's Accessor Methods                    ##
##                                                                 ##
#####################################################################

#______________________________METHOD_______________________________#
#                                                                   #
#                get_foo_value( BAR_NAME, BAZ_NAME )                #
#___________________________________________________________________#

... some (sane) perldoc ...

#__________________________METHOD'S CODE____________________________#

... some (not too horrible) code ...

#__________________________END OF METHOD____________________________#

While I'm not opposed to commenting, this is a bit much, and fairly hard to maintain. 674 of the 770 lines of this module (88%) were comments or perldoc. Among the comments were a list of excuses why they "weren't able to dedicate full-time to the development of this module," pedantic explanations (e.g. of the meaning of TMTOWTDI), and strung-out descriptions of alternate implementations they considered before writing the (still crappy) submitted code.

And again in C:

/***
 ***  Echo command line
 ***/

Pretty, but a bit of a pain to maintain, although some editors make it easier than others. The second gotcha is that, for consistency, new comments have to be written this way too, which people might forget to do.

Of course they sure beat some of our submissions which, unless you count the #! (shebang) line (we don't), had not a line of commenting or inline documentation, or some that had a little at the top of the file but nothing for methods or blocks of code within methods. Neither extreme - too much or too little - is helpful. Consider the maintenance programmer who is looking to change or fix something: he wants to find the area of code that does the work and quickly understand it well enough to make the needed change. Too much commenting and he gets lost, too little (or too many but unfocused comments) and he has to read too much code to figure out what needs changing.

This code is supposed to be an example of people's best code; this is their first impression, and metaphorically, most people are barely bothering to shower and dress, much less put on a suit. There is no time limit for writing it; they can do it on their own time and send it in with their resume. I shudder to think what some of them would write at the usual frenetic pace of real world work.

Speaking of comments, here's something you don't want to see in code that the new guy in your group has written, and finally given to you after three weeks of silently working on the code on his own local machine rather than the server like everyone else:

# This is the FooBar module. It contains all of the functionality required
# to connect to and use a foo bar server.
# THIS IS PROTOTYPE CODE!!!
# IT CONTAINS NO DOCUMENTATION AND HAS NOT BEEN CLEANED!!!

We think it no coincidence that this person's initials were BAD.

In a similar vein, in a C comment:

/*
...
* Note the input stream is very much like a 'data streaming' input.
*      it would be nice if a header was included to specify the
*      number of segments and maybe also number of fields, which of
*      course we would verify.
...

Try wishing in one hand and crapping in the other some day and see which fills up faster. We take the data we're given and the systems that have been using it without a glitch for the past 10 years aren't going to change it just because you'd like them to. In the same comment:

...
* Disclaimer: This source code has NOT been compiled.
*             I would estimate a couple more hours of development to
*             eliminate any typos and other compile time errors.
*             Then the fun starts.
...
*/

I'd estimate a few more times for hell to freeze over before we look at your uncompiled code.

An Array By Any Other Name

Whenever you see something like the following, be afraid:

my ($f0,  $f1,  $f2,  $f3,  $f4,  $f5,  $f6,  $f7,  $f8,  $f9);
my ($f10, $f11, $f12, $f13, $f14, $f15, $f16, $f17, $f18, $f19);
my ($f20);

It's quite probable that the writer doesn't understand arrays, whatever the language. Fortunately the code using this wasn't quite as bad as if($index == 0) { return $f0 } elsif($index == 1) { return $f1 } ... but it was almost worse in terms of efficiency:

for ($idx = 0; $idx < $NUMBER_OF_ITEMS; $idx++) {
  $name = "mn" . $idx;
  $refval = eval "\$f". $idx;
  ...
}

For those unfamiliar with Perl, eval takes a string and evaluates it as code. In benchmarks, using eval is 15 times slower than using an array. And Perl has built-in arrays! Perhaps in one-off code that's not much of a difference, but the assignment is based on a real problem we had where I work: code that must parse hundreds of thousands (millions?) of records a day, and is currently implemented via an Inline::C module for maximum efficiency. In another module:

# The sample message has 21 fields with unknown names.
# "mn" is taken to represent a meaningful name.
my %AMF_fields = (
  mn0  => undef,
  mn1  => undef,
  ...
);

Interestingly this isn't one of the "global problems" discussed below, although it is thread-unsafe: new calls parse, which accesses %AMF_fields directly and then copies %AMF_fields into $self. So, not only was eval used, but rather than allowing people to search and access record fields by name (as required), the fields must be accessed as mn#(), which is handled by an AUTOLOAD - a nifty, but horribly slow mechanism. Furthermore, the code restricts segments to having only 21 fields (notice $NUMBER_OF_ITEMS), an arbitrary limit that will be exceeded by our data and is absolutely unneeded with Perl's dynamic arrays.

What we can get out of this is: know the language you're using, and if you don't, don't say you do because we (for appropriate values of "we") will find out.

Buffer Overruns

In Perl and other "scripting" languages, buffer overflows and memory leaks are for the most part a thing of the past; dynamically sized variables and garbage collection are the cures for those ailments. But we had a few submissions in C which exhibited "all of the above":

#define MAX_FILENAME_LENGTH     50

...

char    input_file[MAX_FILENAME_LENGTH];     /* allow for more than 8.3 filename spec */

...

main(int argc, char *argv[])
{
   if (argc < 2)
   {
      ... error ...
   }

   strcpy(input_file,argv[1]);

   ...
}

He did at least check argc, but then he cheerfully copies a string that, on most Unixes, can be up to 32K long, into a 50-byte variable - but feels like he's done his bit for society because he allows for more than DOS! The humanity! And there's no need to copy it in the first place: use a pointer (argv isn't going anywhere), or if you must copy, use strdup() or check the length first. It looked to us like this guy didn't understand pointers - a fairly fundamental element of C programming.
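
A minimal sketch of the two safer options just mentioned. The helper names input_file_name and copy_filename are hypothetical, made up for illustration; they are not from the submission.

```c
#include <stdlib.h>
#include <string.h>

/* Option 1: no copy at all. argv[] stays valid for the whole run of
 * the program, so keeping a pointer into it is enough.
 * Returns NULL when no file name was given. */
const char *input_file_name(int argc, char *argv[])
{
    return (argc >= 2) ? argv[1] : NULL;
}

/* Option 2: if a private copy really is needed, size it from the
 * source string rather than from a guessed maximum.
 * Returns NULL on allocation failure; the caller must free() it. */
char *copy_filename(const char *src)
{
    size_t len = strlen(src) + 1;      /* +1 for the terminating '\0' */
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, src, len);
    return copy;
}
```

Either way, the length of the argument never has to be compared against a constant like MAX_FILENAME_LENGTH, so there is nothing to overrun.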

We also noticed, up near the definition of MAX_FILENAME_LENGTH, a

#define MAX_TEXT_IN 132  /* based on business rules */

Whose business rules? Certainly not ours! Another good example of what's known as C programmer's disease.

Allocation Adventures

Same program:

char    text_in[MAX_TEXT_IN];

...

  while ( (text_in_ptr++ = fgetc(inp_file_ptr)) != END_OF_RECORD)
  {
     if (strlen(text_in) >= sizeof(text_in) - 10)     /* if we get within 10 bytes of text_in size */
     {                                                /* extend the text_in buffer */
        realloc (&text_in[MAX_TEXT_IN - 1], MAX_TEXT_IN);
        memset(&text_in[MAX_TEXT_IN - 1], NULL, MAX_TEXT_IN);
        used_heap = 1;                                /* used to free memory on exit */
     }
     ...
  }

Dear Lord, save me from working with such a programmer! Let's see. That sizeof(text_in) is always MAX_TEXT_IN, so the threshold is always MAX_TEXT_IN-10. Ever. sizeof is a compile-time calculation. text_in is a fixed-size array, not memory that came from malloc. YOU CAN'T DYNAMICALLY EXTEND IT. Handing its address to realloc is undefined behaviour and will CRASH AND BURN, and the return value of realloc is thrown away besides (but ignorance is bliss, right? - he would do well to look at Henry Spencer's 5th commandment for C programmers). And again with the misuse of sizeof; different program:
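
For contrast, here is a rough sketch of how a buffer that really does grow might look. It assumes the storage comes from the heap from the start; the text_buf type and its two functions are hypothetical names invented for this example.

```c
#include <stdlib.h>

typedef struct {
    char  *data;   /* heap storage, NUL-terminated once non-NULL */
    size_t len;    /* characters stored, not counting the '\0' */
    size_t cap;    /* bytes currently allocated */
} text_buf;

/* Append one character, growing the heap buffer when needed.
 * Returns 0 on success, -1 if memory could not be allocated
 * (the existing contents stay valid in that case). */
int text_buf_append(text_buf *b, char c)
{
    if (b->len + 2 > b->cap) {                 /* room for c and '\0' */
        size_t new_cap = b->cap ? b->cap * 2 : 16;
        char *tmp = realloc(b->data, new_cap); /* realloc(NULL, n) acts like malloc(n) */

        if (tmp == NULL)
            return -1;
        b->data = tmp;                         /* keep the pointer realloc returned */
        b->cap = new_cap;
    }
    b->data[b->len++] = c;
    b->data[b->len] = '\0';
    return 0;
}

/* Release the storage and reset the buffer to its empty state. */
void text_buf_free(text_buf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

A caller would start with a zeroed text_buf, call text_buf_append() for each character read, and call text_buf_free() when done. The current capacity is tracked in a variable because sizeof cannot know it.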

firstfieldptr->field_id = (char *)
  malloc (sizeof (strlen (fieldbuffer)));

Bloody 'ell, what are people learning these days? What (if anything) do they teach them at these schools? (At least half of the responsibility is on the student to learn, but bad schools or books don't help.) Like I said above, sizeof is compile-time. That strlen will never be called, because sizeof just looks at the type of its return value - hopefully, if the right header was included, size_t, but the default int would work too.

The worst thing about the code is that that part works - way more by luck than judgement - because (in our environment, with 32-bit ints) the field should be 4 bytes (3 characters and a null). It's bad because the writer will think "this works" and carry on blithely not learning anything at all until one day it breaks something important.
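
A small sketch of the difference, assuming nothing beyond the standard library. The function names wrong_size, right_size and dup_field are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* What the submission computed: sizeof does not evaluate strlen() here,
 * it only looks at its result type, so this is just sizeof(size_t)
 * no matter how long the string is. */
size_t wrong_size(const char *s)
{
    return sizeof(strlen(s));
}

/* What was presumably intended: the string length plus one byte
 * for the terminating '\0'. */
size_t right_size(const char *s)
{
    return strlen(s) + 1;
}

/* Allocate and fill a copy of a field. Returns NULL on allocation
 * failure; the caller must free() the result. */
char *dup_field(const char *s)
{
    size_t n = right_size(s);
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}
```

On a platform where size_t is 4 bytes, wrong_size() happens to match right_size() for a 3-character field, which is exactly the coincidence described above.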

Of about 5 C programs submitted, only one properly freed the data structures that were allocated. Small memory leaks multiplied over millions of records become big slow memory leaks, and eventually take down the system - a Bad Thing for a production system that many people depend on, and something that's very hard to track down. If you allocate it, free it at the appropriate time in reverse order of allocation, and that goes for any resource, not just memory - file handles, opaque library structures, etc.
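
One common way to keep allocation and release paired up is sketched below. The segment type and its fields are hypothetical, loosely modelled on the message-parsing task rather than taken from any submission.

```c
#include <stdlib.h>

typedef struct {
    char  *name;      /* segment name buffer */
    char **fields;    /* array of nfields heap-allocated strings (or NULL) */
    size_t nfields;
} segment;

/* Allocate a segment in three steps. On failure, undo only the steps
 * that succeeded, in reverse order, and return NULL. */
segment *segment_new(size_t nfields)
{
    size_t i;
    segment *seg = malloc(sizeof *seg);

    if (seg == NULL)
        goto fail;
    seg->name = calloc(8, 1);
    if (seg->name == NULL)
        goto fail_seg;
    seg->fields = malloc((nfields ? nfields : 1) * sizeof *seg->fields);
    if (seg->fields == NULL)
        goto fail_name;

    for (i = 0; i < nfields; i++)
        seg->fields[i] = NULL;
    seg->nfields = nfields;
    return seg;

fail_name:
    free(seg->name);
fail_seg:
    free(seg);
fail:
    return NULL;
}

/* Release everything segment_new() and later code allocated,
 * innermost allocations first. */
void segment_free(segment *seg)
{
    size_t i;

    if (seg == NULL)
        return;
    for (i = 0; i < seg->nfields; i++)
        free(seg->fields[i]);          /* free(NULL) is a no-op */
    free(seg->fields);
    free(seg->name);
    free(seg);
}
```

The same shape works for file handles or library objects: one label per acquired resource, released in the opposite order to acquisition.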

Definitions

Don't do this:

#define BEGIN    {
#define END      }

(It'll confuse maintainers and it's ugly and C is not Pascal.) We liked this one:

#define MYNULL '\0'

(what's wrong with the usual null - inline 0 or '\0', not pointer NULL, of course - that you need your very own personal null?) (I also wouldn't have been critical of a sensible name like NULL_CHAR or NUL but the second might be defined by system libraries so should be bracketed with an #ifdef.) Someone had a whole lovingly crafted file of others like the following:

#define SPACE          0x32

There are two things wrong with that one: the ASCII value for space is decimal 32 (not hexadecimal, which is what 0x signifies), and the best (portable, clean, readable) way to represent a character in C is '[' or ' ' or 'X' or '\t' (for left bracket, space, "X", and tab respectively). In these cases, #defines hinder rather than help readability, compared to including the values inline.
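
To make the mix-up concrete, here is a short sketch that assumes an ASCII execution character set. BAD_SPACE reproduces the quoted value under a different name, and is_field_separator is a hypothetical helper.

```c
/* The value from the quoted #define: hexadecimal 32 is decimal 50,
 * which in ASCII is the digit '2', not a space. */
#define BAD_SPACE 0x32

/* Character constants say what they mean and need no lookup table. */
int is_field_separator(int c)
{
    return c == ' ' || c == '\t';
}
```

In ASCII, ' ' is decimal 32, which is 0x20 in hexadecimal; writing ' ' avoids having to remember either number.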

Reportedly, #define TWO_FIFTY_FIVE 32 has also been sighted in production code (because character 255 was a non-breaking space but then it was determined that normal space could be used, and so instead of replacing the literal 255 in the code with ' ' or a meaningful #define, TWO_FIFTY_FIVE was chosen). There was also:

#define SEGMENTEND '||'

but fortunately he never tried to use it. (A multi-character constant like '||' does compile, but it is an int with an implementation-defined value, not two characters you can search for; multiple characters need to be strings, in double quotes, and can't be treated the same at all.)

Along the same lines, splitting on two pipes (||):

foreach my $seg (split(/\x7C\x7C/, $param)) { # process segments
    $seg     =~ /\x7C/g;                # divide seg's name/value

Why 0x7C? Probably because the writer first tried ||, got an error, and didn't realize that meta-characters like | need to be escaped as \|. So not understanding, they scrabble to find something that works. Because if the { compiler, interpreter } doesn't complain, it must be OK, right?

Globals

When globals are used three problems are potentially introduced: thread safety is compromised, the user (of the code) is restricted to a single instance of an object, and (if they are real globals rather than module-local, like C's static and Perl's my variables) access control is lost. Clean code restricts who can change variables or invoke functions or methods as much as is safely possible.

Consider (for the programming task earlier) the following top level (global) declarations for an implementation of the message parser:

my (@Msg,                                  # message array
    $Ixs,                                  # segment pointer
    $Ixf);                                 # field pointer

...

    sub parse {
        my ($self, $param)  =  @_;              # get input parameters
        @Msg                =  ();              # clear the array
        $Ixs                =  -1;              # reset segment pointer
        $Ixf                =  -1;              # reset field pointer
    ...

Well. It looks on the surface to be OO - it even has a new method (not shown) - but behind the scenes, it's using global arrays to store parsed data. Why? It looked to us like a manifestation of the same problem we've seen before: improper understanding of the language. If the programmer had realized that an array could be stored in a hash - a basic feature of the language - then this kludge would never have been written. What it means is that if you try to create more than one instance of the "object", and parse multiple messages, the last will overwrite the previous instances. Ouch.

But not every use of globals is this heinous; many instances are just a result of laziness: not being willing (or able) to pass around the appropriate variables to functions (or store them in the object or a C struct), e.g.:

static FILE *InFP=NULL;
static long InCnt = 0;

static char Mstr1[GETINPUT_BUFSIZE];
static char EOF_flag = NO;         /* set end-of-file flag */

static char InFileMsg[CUSTOM_MSG_LEN];

static SegmList Gl_SegHead = NULL;
static SegmList Gl_SegLast = NULL;
static SegmList Gl_SegCur = NULL;
static int Gl_SegNumber;

At least he knew enough to use static, but the naming conventions are a little hideous (and short - what's Mstr1?)

While it isn't critical, it definitely raises a flag: why is this programmer exposing the code to the problems above when there is no need?
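
A sketch of the alternative, in C: keep the parsing state in a struct that the caller owns, so several messages can be parsed at once. The msg_parser type and its functions are hypothetical; the "||" segment delimiter is taken from the examples earlier in the article.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *cursor;     /* next unread character, NULL when exhausted */
    long        segments;   /* number of segments handed out so far */
} msg_parser;

/* Point a parser at a NUL-terminated message. No global state is touched. */
void msg_parser_init(msg_parser *p, const char *message)
{
    p->cursor = message;
    p->segments = 0;
}

/* Return a pointer to the next "||"-delimited segment and store its
 * length in *len, or return NULL when the message is exhausted. */
const char *msg_parser_next(msg_parser *p, size_t *len)
{
    const char *start = p->cursor;
    const char *end;

    if (start == NULL)
        return NULL;

    end = strstr(start, "||");
    if (end != NULL) {
        *len = (size_t)(end - start);
        p->cursor = end + 2;            /* skip the delimiter */
    } else {
        *len = strlen(start);
        p->cursor = NULL;               /* last segment */
    }
    p->segments++;
    return start;
}
```

Because every call names the parser it works on, two parsers can be interleaved freely, and giving each thread its own msg_parser sidesteps the thread-safety problem as well.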

Red Flags

Besides critical problems like the above - i.e., the programmer doesn't have a good grasp of the language, or logic - there are other red flags I look for when reviewing code. Is it visually clean (indented, use of whitespace to separate blocks of code, comments for blocks, not over 80 columns)? Are variables named informatively, but not too verbosely? Generally the higher the scope the more verbose a name should be; global names should be extremely descriptive and unique, if they are used at all. I had to install and run GNU indent to even read one guy's code; more people would dump him without going that far, but I was curious to see what indent would make of it.

/*Pops out and displays the segment names from the stack*/

void pop()
{
   ...
}

Bit short a name for such a specific routine, what? It's part of a cast of thousands including push(), next(), and prev(). Especially when it's not static (that is, local to the file). In the actual code for the routine, there is an unused integer variable i, which brings up the next point: warnings. I run C code we receive through gcc -Wall; it's often very informative. (Heaven help them if it doesn't even compile - someone purposely gave us code he hadn't compiled, so we purposely didn't read it; another chap had a reference to the DOS getch() function.) Some warnings are tolerable, but things like omitting return types on functions and letting them default to int are inexcusable in modern C. And it's not as if this -Wall check is some secret trick - the submitter is quite free to try it first and clean up their code before giving it to us!

If you ask someone to do something and they demonstrate that they haven't carefully read the instructions - or, if said instructions are lacking, they don't ask you for more information - this is a clear sign that they don't care about producing a quality result that matches what you specify. We had a lot of people produce solely interactive code for our task, rather than a reusable module (with perhaps some interactive demo code).

Generally anything that breaks any of the aforementioned Ten Commandments for C Programmers is a critically bad omen.

Reinventing Wheels

Perl has a huge library of modules that are (for the most part) well written and maintained, and can do almost anything. Of course I'm talking about CPAN, the Comprehensive Perl Archive Network. C also has some good libraries; one I've started using recently is GLib, the general-purpose utility library from the GTK+/GNOME project, which provides facilities for logging, error handling, dynamic strings and arrays, hashes, etc., along with GNet, which provides a plethora of network objects.

Reinventing these things for oneself doesn't show skill and creativity, it shows that you're going to waste the company's time building things that don't need to be built. Sometimes there's a reason to reimplement something - to increase efficiency, perhaps, but only after you benchmark - but not very often, and the author of the original product should be contacted first to see if he'll accept a patch from you to change the library in question.

Somewhat along the same lines, duplication of code in a module or set of modules is a bad sign; it probably needs to be refactored. If the same (or similar enough) code occurs many times, it needs to be fixed in multiple places when it breaks, and makes for more code for a maintainer to needlessly wade through in general.

Further Examples

This guy didn't know how to get fields in a message into order (see comments), so he prefixed them with a letter and used sort all the time. Not knowing how to make something work is one thing, but he never asked anyone. (This guy worked for us for several weeks and produced hardly any code; he is the reason we instituted our process of asking people to complete the task before we consider interviewing them.)

  my $mesg = {
    "A_header"  => MHEADER, # I know the A-E is ugly but for some reason
    "B_crc"     => 0,       # the sort function isn't working properly.
    "C_extkey"  => $extkey, # This will do until I have time to investigate
    "D_record"  => $record, # the issue.
    "E_footer"  => MFOOTER,
  };

Holy compressed declarations, Batman!:

struct segmentnode {
struct fieldnode *firstfieldptr; struct segmentnode *next; struct segmentnode *prev;
} *newsegmentptr, *createsegmentnode(), *firstsegmentptr,
*currentsegmentptr;

Please. Separate function declarations from global data. Also a C function declared as taking () means it takes anything (like (...), if ... were legal with no other parameters) which is dangerous and defeats compiler parameter checking (use (void) for no parameters). See this FAQ entry about prototypes for more information.
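
A sketch of the same declarations pulled apart, with a real prototype. The type and function names are lightly renamed, hypothetical versions of the ones in the excerpt, and the field-list member is left out to keep the example short.

```c
#include <stdlib.h>

/* The type on its own, one member per line. */
struct segment_node {
    struct segment_node *next;
    struct segment_node *prev;
};

/* A prototype: (void) states that the function takes no arguments, so
 * the compiler can reject a call that passes any. An empty () only
 * means the same thing from C23 onwards. */
struct segment_node *create_segment_node(void);

/* Returns a new unlinked node, or NULL on allocation failure.
 * The caller must free() it. */
struct segment_node *create_segment_node(void)
{
    struct segment_node *node = malloc(sizeof *node);

    if (node != NULL) {
        node->next = NULL;
        node->prev = NULL;
    }
    return node;
}
```

Any list-head or current-node pointers would then be declared separately, ideally inside whatever struct owns the list rather than at file scope.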

Speaking of prototypes, declaring your own library functions because you can't remember what system header file they're in (hint: check the man page) is also not kosher:

FILE *fopen();

This might have been the same old-schooler who used original K&R-style declarations, though. Yes, there may be some systems that don't take ANSI C, but we're not using them, so assume ANSI C unless you're told otherwise.

JavaInPerlIsNotLovelyAtAllWeHatessItPrecioussWeHatessIt:

$ref_urlsWithoutQueryStringTranslatedPassOne =
 translatePassOneWithoutQueryStringURLs($self->{urlsTrainingNonQueryGood}, $self->{CONSTANTS});
$ref_urlsWithoutQueryStringTranslatedPassOnePostGrouping =
 groupWithoutQueryStringURLsAfterPassOneTranslation($ref_urlsWithoutQueryStringTranslatedPassOne,
 $self->{CONSTANTS});

'Nuff said.

Don't redefine common or short names (e.g. nl which one library defined and another used as a parameter name) or keywords, at least not without explaining the consequences fully to all users of your code:

#     define main ace_main_i (int, char *[]); \
ACE_MAIN ()

(From a hideous monstrosity of a product called ACE.)

#define my_new(s) \
     someFunc(_FILE_,_LINE_,s)
#define new my_new

(Breaks placement new. "But I don't use placement new!", you say (probably having never heard of it). Yes, but <fstream> does. Since <fstream> contains mostly templates, my_new is inserted into the expansion of the header itself. Boom. A forest of strange error messages.)

* does not mean repeat!:

// The global ptr for the time definition.
static SomeClass * FOO_DLLSPEC theClass[SOME_DEFINE]={SOME_DEFINE*NULL};

This happens to work because NULL (in this environment) expands to 0 and a partially initialized array automatically sets the remainder of its elements to zeroes of the appropriate type (here, null pointers).
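
The initialization rule can be shown without the multiplication. This is a minimal sketch; SLOT_COUNT is a hypothetical stand-in for SOME_DEFINE and count_null_slots exists only to make the result observable.

```c
#include <stddef.h>

#define SLOT_COUNT 8   /* hypothetical stand-in for SOME_DEFINE */

/* Count how many elements are null after initializing only the first. */
size_t count_null_slots(void)
{
    /* One explicit initializer; the remaining elements are implicitly
     * initialized to null pointers as well. */
    int *slots[SLOT_COUNT] = { NULL };
    size_t i;
    size_t nulls = 0;

    for (i = 0; i < SLOT_COUNT; i++) {
        if (slots[i] == NULL)
            nulls++;
    }
    return nulls;
}
```

So { NULL } alone already yields an array full of null pointers; nothing is being "repeated".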

May I introduce the C struct keyword?:

double ( **barFoo_baz_max       ) = NULL;  // bar foo baz maximum
double ( **barFoo_baz_min       ) = NULL;  // bar foo baz minimum
double ( **barFoo_baz_avg       ) = NULL;  // bar foo baz average

And it goes on, for (literally) thousands of lines.... Also, in this Brave New World of C++, std::vector<> should be preferred for variable-length arrays.
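
A sketch of what grouping the related values into a struct might look like in plain C. The names baz_stats and baz_stats_of are hypothetical; the original code's data layout is not shown, so this only illustrates the idea.

```c
#include <stddef.h>

/* One record instead of three parallel variables. */
typedef struct {
    double max;
    double min;
    double avg;
} baz_stats;

/* Compute the maximum, minimum and average of n values.
 * For n == 0 all three members are 0.0. */
baz_stats baz_stats_of(const double *values, size_t n)
{
    baz_stats s = { 0.0, 0.0, 0.0 };
    double sum = 0.0;
    size_t i;

    if (n == 0)
        return s;

    s.max = values[0];
    s.min = values[0];
    for (i = 0; i < n; i++) {
        if (values[i] > s.max)
            s.max = values[i];
        if (values[i] < s.min)
            s.min = values[i];
        sum += values[i];
    }
    s.avg = sum / (double)n;
    return s;
}
```

With the three numbers travelling together, a table of them becomes a single array of baz_stats instead of three separately allocated double ** variables.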

if(donaldDuck[idd].goofy.size() > 0)
     {
       for ( i = first; i <last; i++ )
       {
         i2++;
       }
     }

(Names changed to protect the guilty, of course.) Um. As far as i2 is concerned (and assuming last >= first), this is exactly equivalent to if(donaldDuck[idd].goofy.size()>0) i2 += last-first;. Not to mention that the indenting and spurious braces are ugly in the extreme.

sprintf( thestring[therow],"%19s","CONT" );

Er. It's a constant string. I know strcpy() is scary (should be using std::string anyway, since that was in a C++ project), but find someone brave and get them to type it!
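
For anyone rewriting that line, note that "%19s" right-justifies its argument in a 19-character field, so the constant being produced is fifteen spaces followed by CONT. A bounded sketch in C follows; write_cont_label is a hypothetical name, and the destination size is an assumption because the original declaration is not shown.

```c
#include <stdio.h>

/* Write "CONT" right-justified in a 19-character field, never writing
 * more than size bytes (including the '\0'). Returns the length the
 * full string needs, as snprintf does, so the caller can detect
 * truncation by comparing the result with size. */
int write_cont_label(char *dest, size_t size)
{
    return snprintf(dest, size, "%19s", "CONT");
}
```

A plain strcpy() of the padded literal gives the same bytes, provided the destination is known to hold at least 20 of them.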

Conclusion

All programmers suck at some point, and we're still seeing the trickle-down of the boom years where anyone with an editor and the ability to type angle brackets (or use FrontPage) was a "programmer." If you're hiring, don't trust resumes: they lie, or at least twist the truth or tell only a partial story. You'll save yourself an amazing amount of time by getting your applicants to write a simple program - simple enough to let them shine if they're competent but complex enough to require them to use some features of the language and standard libraries - which you can use along with their resumes to choose the ones you want to interview (and make sure you have a technical person present at the interview, or a separate technical interview). Don't worry too much about degrees (and most certificates are absolute crap); experience, problem-solving abilities, and communication are far more important. Having said that, I realize that if a position is flooded with applications, sometimes it's all HR can do to "circular file" things like untidy resumes or people without a relevant degree.

If you recognize yourself in these code samples, it's not too late (to go to plumbing school (just kidding)). There are good resources for programming: first of all programmers that you know and respect; then books, the Internet (weed very critically), courses, etc. But you have to take the initiative. If you're already working as a developer, ask for code reviews and review other people's code when they ask - don't be unkind, but do be honest. As well, managers will usually be happy to see you initiate such quality control measures.

For the good programmers out there: be encouraged; while software development seems to be a hugely oversaturated field, your skills are not easily obtained and if you can write safe, efficient, clean, and well-documented code within a reasonable schedule, and can communicate well, you'll be in demand.

Related Articles

I've also written the following articles on development for K5:

o A Modest Proposal for Code Restructuring
o Differentiating Developers


Related Links
o programming task
o help future maintainers
o TMTOWTDI
o Perl
o assignment
o C programmer's disease
o commandment for C programmers
o C is not Pascal
o array could be stored in a hash
o CPAN
o GLib
o GNet
o refactored
o task
o prototypes
o ACE
o placement new
o I
o A Modest Proposal for Code Restructuring
o Differentiating Developers
o Also by czth


Certainly Not Logic | 244 comments (203 topical, 41 editorial, 0 hidden)
Dork. (1.06 / 15) (#4)
by Mr Hogan on Tue Aug 12, 2003 at 05:59:06 PM EST

Virgin.

--
Life is food and rape, then tilt.

Virgin? (2.00 / 3) (#10)
by tkatchev on Tue Aug 12, 2003 at 06:49:46 PM EST

What a horrible insult.

(If you have the mind of Beavis and Butthead.)

   -- Signed, Lev Andropoff, cosmonaut.
[ Parent ]

Heh. (1.75 / 4) (#16)
by Mr Hogan on Tue Aug 12, 2003 at 07:36:16 PM EST

You said inslut.

--
Life is food and rape, then tilt.
[ Parent ]

Very clever. (2.66 / 3) (#56)
by tkatchev on Wed Aug 13, 2003 at 05:51:37 AM EST

Not.

   -- Signed, Lev Andropoff, cosmonaut.
[ Parent ]

Oh, Ivan. (2.33 / 3) (#76)
by Mr Hogan on Wed Aug 13, 2003 at 01:14:57 PM EST

I'm not here to be clever or funny or to debate pontificate and discuss issues - I'm here to annoy the computar admins with grating gibberish and - when the opportunity presents itself and I can rouse the enthusiasm to write 'dork' 'virgin' - remind them frankly why it is they're resentful porn-addicted fans of Harry Potter in adult bodies. I feel I deserve some sort of credit and recognition for that.

--
Life is food and rape, then tilt.
[ Parent ]

Well, to be truthful... (1.50 / 2) (#78)
by tkatchev on Wed Aug 13, 2003 at 01:45:46 PM EST

...so am I.

I'm just goading you on.

   -- Signed, Lev Andropoff, cosmonaut.
[ Parent ]

Ah ha! (1.50 / 2) (#82)
by CFK on Wed Aug 13, 2003 at 02:10:25 PM EST

We have a winner!

[ Parent ]

Don't get me wrong. (2.75 / 4) (#86)
by Mr Hogan on Wed Aug 13, 2003 at 02:26:35 PM EST

They're good people think and act no less absurdly than those who came before them fewer things can be less certain - or at least I am not so powerful I can judge them sternly sentence harshly - it's just that - well Jesus if man should live and die as if his life and death did matter then please God don't be cruel tell them Perl doesn't matter.

--
Life is food and rape, then tilt.
[ Parent ]

An objection (4.72 / 11) (#12)
by Pac on Tue Aug 12, 2003 at 07:01:19 PM EST

While amusing, collecting junior programmers' or students' errors is an endless, and in the end pointless, task unless you are their teacher or mentor and can do something about it.

The opening words of your conclusion, "Most programmers suck", should perhaps be changed to "All programmers suck (at some point in time)".

I own a small software development company. We gave up trying to find the best programmer out there because, first of all, she doesn't exist. And second, when she exists her income expectations are far above our paying capacity.

So I teach. I review code. I call people names over variable naming (IDE programmers love to leave that default new JPanel name "jPanel1" alone - something terrible when you have to discover what the hell "jPanel546" is for), I insist on code restructuring and code rewriting until I am happy with it. I also give people books, URLs, tasks above their heads. And amazingly, people learn.

So I say, don't trust a small piece of code crafted under pressure to choose your programmers. Someone's over or under-documentation tendencies are easily overcome. Someone's fear of the new isn't. Bad habits learned in school, lack of basic or specific knowledge, language specific idioms. All these are irrelevant. Communication skills, willingness to learn and to work hard are far more important.

But just to be a bit agreeable, I agree degrees and certificates are worthless in most cases. We all have been to school and we all know school teaches you a static picture of the past. The only thing worth learning is the capacity to learn more. And certificates are worse yet. They only assert you were capable of learning a lot about a very specific domain at a specific point in time.

Evolution doesn't take prisoners


k5Reply187413 (4.33 / 3) (#14)
by czth on Tue Aug 12, 2003 at 07:11:43 PM EST

The opening words of your conclusion, "Most programmers suck", should perhaps be changed to "All programmers suck (at some point in time)"

Good point, I'll do that, might also help it sound less like I think I'm an exception.

I own a small software development company. We gave up trying to find the best programmer out there because, first of all, she doesn't exist. And second, when she exists her income expectations are far above our paying capacity.

I think we're finding that too, and I'm going to be teaching too. (And arrgh why must you use 'she'; 'he' is accepted as a gender-neutral pronoun....)

So I say, don't trust a small piece of code crafted under pressure to choose your programmers.

No (or little) pressure. They can take hours or even days before they send us the code and their resume.

But just to be a bit agreeable, I agree degrees and certificates are worthless in most cases. We all have been to school and we all know school teaches you a static picture of the past.

I see the most worth of school as giving people time, focus, and resources to learn themselves. And it's not something most people will ever have again (to the same extent, anyway) after they start working fulltime. FWIW I have a BMath in CS, and I had fun and taught myself a lot and think I was in a good program and would definitely do it again. But it may not be for everyone.

Bad habits learned in school, lack of basic or specific knowledge, language specific idioms. All these are irrelevant. Communication skills, willingness to learn and to work hard are far more important.

Only irrelevant to a point. Also in this economic climate companies get to be picky, which unfortunately is not good for graduates looking for their first job. And some people can work hard for a long time and still not "get it" (one big thing seems to be pointers - or references - some people just don't seem to be able to grasp them; see this article by Joel Spolsky).

czth

[ Parent ]

a few things (5.00 / 3) (#21)
by tps12 on Tue Aug 12, 2003 at 08:33:23 PM EST

First, while I don't think you necessarily need to alternate "he" and "she" in your writing, attacking others for doing so can only be considered sexist. The best programmer in my company's engineering department is a woman, and my experience in school indicated that female programmers are on average better than male.

Second, the primary benefit of a CS education, IMHO, is knowledge of algorithms and problem solving. Many (and many of the best) CS programs do not address practical use of specific languages in real-world applications. I think you're better off hiring those with a good understanding of the principles of computer science than those who can pump out well-commented perl. It's easy to teach style and library use, much harder to compensate for an absent four years of intense study.

[ Parent ]

Writing and Modeling (4.50 / 2) (#28)
by cam on Tue Aug 12, 2003 at 09:08:18 PM EST

I think you're better off hiring those with a good understanding of the principles of computer science than those who can pump out well-commented perl.

I think the best attribute a developer can have is the ability to learn, understand and then model a business process. The programming language and platform technologies usually get in the way of expressing that business process.

cam
Freedom, Liberty, Equity and an Australian Republic
[ Parent ]

Real programmers and real code (none / 0) (#40)
by czth on Tue Aug 12, 2003 at 11:20:33 PM EST

First, while I don't think you necessarily need to alternate "he" and "she" in your writing, attacking others for doing so can only be considered sexist.

Maybe by you - it's just confusing to me to alternate. If there's a good neutral pronoun use it, and there is, it just happens to coincide with the male pronoun. That is not at all meant to be derogatory to women programmers, it's just for clarity of language.

Second, the primary benefit of a CS education, IMHO, is knowledge of algorithms and problem solving. Many (and many of the best) CS programs do not address practical use of specific languages in real-world applications. I think you're better off hiring those with a good understanding of the principles of computer science than those who can pump out well-commented perl.

If you can do algorithms, you should be able to write code (i.e. I agree, the coding is the easy part), if not, well that's nice but we don't need ivory tower academics. None of my courses taught me specific programming languages either and I'm glad of it. So while theoretically what you said sounds good, we're in the real world and need real programmers that can write real code. And commenting is about language and communication and should be the easiest part; there's no excuse for lack there.

czth

[ Parent ]

wow, I wish I could work for you.. -nt- (none / 0) (#25)
by Suppafly on Tue Aug 12, 2003 at 09:00:56 PM EST


---
Playstation Sucks.
[ Parent ]
I wish i could work anywhere ... (5.00 / 2) (#46)
by omegadan on Wed Aug 13, 2003 at 01:17:00 AM EST

I can't get a job for love or money, even with an ok resume

Religion is a gateway psychosis. - Dave Foley
[ Parent ]

me neither (5.00 / 1) (#208)
by Suppafly on Sat Aug 16, 2003 at 01:03:43 AM EST

I can't get a job for love or money, even with an ok resume

All I know about Bush is I had a job when Clinton was president. Yeh I know how that is.. I'd kill for a halfway tech related job.. great sig btw.. I saw it a while back and added it to my list of fav. sayings.
---
Playstation Sucks.
[ Parent ]
She? (3.22 / 9) (#38)
by Stick on Tue Aug 12, 2003 at 11:08:34 PM EST

There's your problem. Hire a man.


---
Stick, thine posts bring light to mine eyes, tingles to my loins. Yea, each moment I sit, my monitor before me, waiting, yearning, needing your prose to make the moment complete. - Joh3n
[ Parent ]
It worked for you. (nt) (none / 0) (#53)
by x10 on Wed Aug 13, 2003 at 04:30:46 AM EST


---YOUR ZEROES ONLY MAKE ME STRONGER---
[ Parent ]

Re: An objection (4.66 / 3) (#114)
by gidds on Thu Aug 14, 2003 at 09:22:23 AM EST

And amazingly, people learn.

No, that's not the amazing thing. The amazing thing is that someone in a position of responsibility cares about good code!

While many managers and leaders claim to care about good code, in practice, IME most simply want the job done quickly; as long as it works, who cares about the code itself? I'm fed up with trying to explain that solution X will lead to far more trouble in the long term, that solution Y will take longer now but will pay for itself, when managers don't care about the long term and simply want the job done now.

And in such an environment, is it any wonder that developers learn not to care about code quality either?

If the environment is one of caring about code quality, then developers will learn to care too. They'll learn to question their code, to improve it, to learn from other people's code, to not be satisfied with the quick fix.

IMO the ideal programmer isn't one who's perfect, but one who's always trying to improve.

Andy/
[ Parent ]

Those will not be forgotten years... (4.00 / 1) (#176)
by Pac on Fri Aug 15, 2003 at 12:58:19 AM EST

During the boom everything was for yesterday and everyone was a programmer. I earned my fair share of money both building applications against impossible deadlines and fixing so-called programmers' garbage (also against impossible deadlines, because by the time the teenage CEO noticed that his - with a nod to my critics, "his" :) - already over-hyped application did not exist, it was always too late). Now things seem to be settling down a little and things like proper design, correct specifications and long-term maintainability are again becoming important.

As for teaching, I don't really have a choice. We are a small company and I must be closely involved in the development process (albeit sometimes against my partners' wishes). We have a project right now where I had to let one of the lead developers go (because his current existential crisis was threatening not only the deadlines but the project itself). So yours truly had to cover for the missing person (it was also a "The Mythical Man-Month"-like situation, where we were too far into the project to bring in more people without much harm).

Being there in the trenches, there is no point in me letting people be killed over small stupid mistakes I've already seen and made myself a thousand times over (simple things like bad code formatting, bad code organisation, bad variable naming, absent to null documentation - little things that add up to inevitable nightmares in the near future).

You are completely right, there is no such thing as a perfect programmer. In this field, either you learn something new every day or you wasted a day.

Evolution doesn't take prisoners


[ Parent ]
I'd like to work for you (5.00 / 1) (#188)
by webwench on Fri Aug 15, 2003 at 10:31:22 AM EST

I'd love to see more emphasis on teaching and mentoring at work in this line of business. The problem seems to be that (1) deadlines, (2) competitiveness, and (3) fear of ridicule get in the way.

[ Parent ]
Amazing (4.00 / 2) (#13)
by OldCoder on Tue Aug 12, 2003 at 07:06:48 PM EST

Well, I hope that whoever has my job isn't one of these dorks.

Problems with the problem statement:
The first line of the second paragraph has

an ASCII string of segments ending with double-pipes ("||")
The first three times I read it, I thought it was asking for a string of segments then terminated by a double-pipe. That is, seven segments would have but one double-pipe. Eventually I caught on that seven segments implies seven terminating double-pipes.

Your problem statement doesn't have any information on the amount of data to be processed. If it's 10 lines a day you do it differently than if it's 10 million lines in a day.

--
By reading this signature, you have agreed.
Copyright © 2003 OldCoder

Problem statement (none / 0) (#15)
by czth on Tue Aug 12, 2003 at 07:16:24 PM EST

The first three times I read it, I thought it was asking for a string of segments then terminated by a double-pipe. That is, seven segments would have but one double-pipe. Eventually I caught on that seven segments implies seven terminating double-pipes.

Point. I might insert the word 'each' so it reads '... an ASCII string of segments each ending with a double-pipe ("||")'. Of course the reader should be able to figure it out from the example but it's nice to be clear.

Your problem statement doesn't have any information on the amount of data to be processed. If it's 10 lines a day you do it differently than if it's 10 million lines in a day.

It's not enough of a big deal, we don't need a bulletproof solution, just one that shows the candidate knows what he's doing, i.e. things like linear searches are fine (in C anyway, less so in perl since it has built-in hashes).

czth

[ Parent ]

Actually, it is a big deal. (5.00 / 3) (#71)
by tkatchev on Wed Aug 13, 2003 at 12:10:46 PM EST

The whole (and only) point of "computer science" is knowing when and how to scale up.

Everything else is really incidental.

   -- Signed, Lev Andropoff, cosmonaut.
[ Parent ]

Lesson learned (3.80 / 10) (#20)
by A Proud American on Tue Aug 12, 2003 at 08:25:59 PM EST

I'll never use C or Perl again.  Talk about unintuitive and antiquated!

____________________________
The weak are killed and eaten...


thats a cool sig -nt- (none / 0) (#24)
by Suppafly on Tue Aug 12, 2003 at 08:59:14 PM EST


---
Playstation Sucks.
[ Parent ]
Why thank you (4.40 / 5) (#26)
by A Proud American on Tue Aug 12, 2003 at 09:01:12 PM EST

It's unique, creative, and Made in America with pride.

____________________________
The weak are killed and eaten...


[ Parent ]
about refactoring (4.50 / 4) (#22)
by martingale on Tue Aug 12, 2003 at 08:58:22 PM EST

Somewhat along the same lines, duplication of code in a module or set of modules is a bad sign; it probably needs to be refactored. If the same (or similar enough) code occurs many times it needs to be fixed multiple places when it breaks, and makes for more code for a maintainer to needlessly wade through in general.

This is actually a difficult question. It isn't clear to me that refactoring is usually the right answer for maintainability. Certainly, if you're implementing a well defined, generic algorithm in several places, then refactoring is the answer. You can also use templates to parameterize away the small differences.

However, I've also found cases where I'd written common code which needed to be split up again, because the simplest (read understandable/maintainable) common solution didn't quite fit all the individual cases. Invariably, this is for tasks which aren't well defined CS algorithms, but rather involve user interactivity and real world special cases.

What's really annoying, I found, is that when you make a small change to the common piece of code, suddenly it breaks one of the locations where it is used, because the change in the common logic is no longer applicable to all locations equally. So in this case, code generalization actually goes too far, and you only notice it when it's too late (or when a requirement changes, more probably).

Writing for maintainability is a bloody mess.

programming jobs suck.. (3.80 / 5) (#23)
by Suppafly on Tue Aug 12, 2003 at 08:58:40 PM EST

it really sucks that to get an entry level job doing programming, you are expected to have several years of experience.. if you expect everyone you hire to have as much experience as you, make sure you are actually paying what such experience deserves, otherwise hire someone right out of school and give them a little breathing room to learn.
---
Playstation Sucks.
Point (none / 0) (#35)
by czth on Tue Aug 12, 2003 at 10:24:18 PM EST

I graduated summer 2001 and a lot of the people we're considering have years more experience... it's depressing, really. I wish it weren't so. I wouldn't mind teaching someone if they demonstrated the vaguest hint of teachability, either - I like teaching.

czth

[ Parent ]

Programming jobs suck in general (none / 0) (#37)
by Stick on Tue Aug 12, 2003 at 11:04:30 PM EST

There are easier ways to make money, and you can code in your free time.


---
Stick, thine posts bring light to mine eyes, tingles to my loins. Yea, each moment I sit, my monitor before me, waiting, yearning, needing your prose to make the moment complete. - Joh3n
[ Parent ]
Funniest thing i've ever heard (4.80 / 10) (#43)
by omegadan on Wed Aug 13, 2003 at 12:56:43 AM EST

"God damnit this Visual C++ is a piece of SHIT!! It only allows 16,384 global variables!"

Religion is a gateway psychosis. - Dave Foley

In a similar vein (4.00 / 1) (#58)
by Big Dogs Cock on Wed Aug 13, 2003 at 08:49:35 AM EST

This comment (names changed to protect the twat):
/* Function : idiot
Returns : void
Parameters: none
Uses globals: stuff, thing, wibble, every, fucking, man, and, his, dog, twice, over, i, am, a, twat, very, long, list, goes, here*/

People say that anal sex is unhealthy. Well it cured my hiccups.
[ Parent ]
Attn all wiz-kid programmers! (4.60 / 5) (#45)
by Beneath the Waves on Wed Aug 13, 2003 at 01:11:19 AM EST

Move out of your parents' house, and run the world now while you still know everything!

Seriously, rather than just bitching about other people's skills, why don't you do something more useful and productive, like say write your own book on how to program in C?

You can lead a horse to water, etc. (4.60 / 5) (#47)
by czth on Wed Aug 13, 2003 at 01:27:20 AM EST

Do you think that the existence of good C books infringes at all on the lives of the people writing the C code above? I'm not saying there are no good C books - there are many - just that people don't read them. Or man pages. Whose fault is that? It might be traceable back further to lower education that didn't encourage reading enough (I grew up without a TV; I'm extremely glad I did; I read a lot and taught myself to program early) or teach comprehension well enough, or poorly taught computer science courses in school, but whatever the handicaps a person had the responsibility is finally theirs.

If we can hire someone who's teachable - willing and able to learn - I'd be more than happy to devote what time the company allows (and even some of my own if it looks to be well spent) to patiently making up the deficit in their education, as long as the gap isn't too great.

Oh, and FWIW, I've been (a long way) out of my parents' house for over 5 years and am getting married at the end of the month. So yes, while I may have been a geeky programmer kid without a TV, I consider it time well spent and things are working out just fine for me :P.

czth

[ Parent ]

Object-oriented! Extendable! Production-level! (4.83 / 6) (#49)
by ZorbaTHut on Wed Aug 13, 2003 at 01:56:50 AM EST

Wait, what? I just read through your task, and I'm tempted to look up what company you work at just so I don't accidentally apply there :P

First: what do you mean by object-oriented? Do you want the interior workings of it to be OO, or just the interface?

Second: what do you mean by extendable? Do you want it to be able to handle new data types? The same data types, but with different input mechanisms? The same data types and input mechanisms, but different data *structures*? What exactly do you want extendable? I could just make everything extendable, but it'll be about ten times longer than necessary, and probably much slower too . . .

Third: Define "production-level". Personally, my production-level code looks a lot like my quick hacks - they're both designed to stand up to a reasonable amount of abuse, but choke with a useful error message whenever something truly bizarre comes down the queue. Neither of them are designed to be as fast as humanly possible - if they're the right algorithmic complexity, it's Good Enough until proven otherwise. So what do you want here? Hand-tuned assembly that handles all possible errors?

Fourth: "ordered traversal". I'm personally wondering if you expect this to slot into an STL iterator template or not. Probably not. I still do wonder what sort of traversal you want.

I realize that this might be the mark of a bad employee - one that asks questions about their tasks instead of just doing it - but I personally would worry about writing up a good solution, coming in, and finding out that your definition of "production-level" is entirely different from mine. I probably could do what you want *if* I knew what it was.

At my job, generally I'd have some idea of what this parser was going to be used for, and this would answer all my questions. In this case, it's basically programming in a vacuum.

I agree (4.66 / 3) (#51)
by epepke on Wed Aug 13, 2003 at 03:39:50 AM EST

While it's fun to laugh at some of the terrible boners czth got in his submissions, it's also fun to criticize the problem statement. Of course, perhaps it was meant to be as vague as problem statements one usually gets in industry, but that's quite a different environment, because one usually has the opportunity to talk about it. But anyway, specifics:

First: what do you mean by object-oriented? Do you want the interior workings of it to be OO, or just the interface?

I don't know what he means by that, either. I know what a regular expression parser is, which is all that is needed from this one. Perhaps he wants to see if people can use the object features of a language, but this says very little about the design. Is this to be an object that contains a suite of functionality, including parsing and searching? Is it a separate box as a parser that returns self-searchable objects? Coarse-grain or fine-grain? I don't know, but perhaps it's a guessing game.

Second: what do you mean by extendable? Do you want it to be able to handle new data types?

Extendability (or extensibility) is usually considered a holy grail and is not specified more precisely than this. Nevertheless, in a test situation it doesn't quite work so well. I could write a parser for this that was extensible up the wazoo, but it would come out looking something like Xerces and would probably get me pegged as one of those people who would waste his company's money.

Third: Define "production-level".

Not to mention that this conflicts with extensibility. There's always a tradeoff.

But, in any event, the test says nothing about the "production" environment. In the article, he praises the Perl modules and GNU libraries and basically implies that, if you reinvent what's in them, you suck. But, nowhere on the test does it state that linking in GPL libraries, or in fact any free libraries without permission by a team of lawyers, is part of this "production" environment. I know of plenty of production environments where that would be fine and plenty where they'd have an embolism if you even asked. In any event, the convention for tests since grade school is "your work only unless otherwise specified." It's another guessing game.

Fourth: "ordered traversal". I'm personally wondering if you expect this to slot into an STL iterator template or not. Probably not. I still do wonder what sort of traversal you want.

I'm especially confused about this one. My best guess is that the order of the segments must be preserved, but what of the fields within a segment? And what are the searches going to be, anyway? Just exact matches? Hash tables are fine. Begins-with searches? Hash tables aren't fine, so you need some other structure. Contains searches? Another problem.
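To make the distinction between search types concrete, here is a minimal Python sketch. The field names are made up for illustration, and `begins_with` is a hypothetical helper, not anything from the test itself. It shows a hash table serving exact matches, and a sorted list with binary search serving begins-with matches, which a hash table cannot answer without scanning every key.

```python
import bisect
from collections import defaultdict

# Hypothetical field names, for illustration only.
names = ["ABC", "ABD", "XYZ", "ABC"]

# Exact-match lookup: a hash table mapping each name to every position
# where it occurs, so repeated names are not lost.
exact = defaultdict(list)
for position, name in enumerate(names):
    exact[name].append(position)

# Begins-with lookup: keep (name, position) pairs sorted, then bisect
# to the start of the contiguous run that shares the prefix.
ordered = sorted((name, position) for position, name in enumerate(names))


def begins_with(prefix):
    """Return positions of all names starting with prefix, in name order."""
    start = bisect.bisect_left(ordered, (prefix,))
    matches = []
    for name, position in ordered[start:]:
        if not name.startswith(prefix):
            break
        matches.append(position)
    return matches
```

Contains searches fit neither structure; short of a full scan they would need something like a suffix index, which is probably overkill for an exercise of this size.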

Often, this stuff is impossible to find out, but not always. You can tell a lot if you know that the searches are going to be initiated by someone who types or if they come from a table of a database.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
Answers (4.00 / 1) (#66)
by czth on Wed Aug 13, 2003 at 11:42:27 AM EST

OO is a hint because without it people won't write objects (or the C equivalent, basically an opaque struct with a set of functions with a common prefix that take the struct as a first parameter like C++'s this). Think "ability to have two parsed message objects active at the same time" and see the Globals section in the article.

As to linking in libraries, it's an exercise and we won't be actually using it for our production system, so you're not restricted by licenses. I'd say general purpose libraries (e.g. glib like I mentioned in the article) are OK, but if you happen to dig up a library that already does this sort of exact thing we'd prefer you forgot about it and wrote your own code (we primarily want to see how people design and code a solution to a problem, so if the libraries used get in the way of that don't use them, but using something that provides data types (e.g. growable arrays, hashes, strings) that's fine; we don't need to see Yet Another C String Library written).

We do take questions, as I said in the other reply. For searching, something many people miss is that there is no guarantee of uniqueness in field or segment names. Yes, order must be preserved (if you aren't told otherwise assume that no information should be lost). Exact matches on segment and field names are fine ("allows... searching for segments and fields... by name"); no content searches are required.
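As a rough illustration of those two requirements (names need not be unique, and order must be preserved), here is a minimal Python sketch. The function names and the sample message are invented for this example, and it assumes valid input as the problem statement does. It keeps lists of pairs instead of name-keyed hashes, so nothing is overwritten or reordered.

```python
def parse_message(text):
    """Parse 'NAME|fffdata|fffdata||NAME|...||' into an ordered list.

    Returns a list of (segment_name, fields) pairs, where fields is a
    list of (field_name, data) pairs. Lists rather than dicts are used
    so that repeated names and the original order both survive.
    Assumes valid input, as the problem statement does.
    """
    segments = []
    # Every segment ends with "||", so the piece after the last "||"
    # is an empty string and is dropped.
    for chunk in text.split("||")[:-1]:
        name, *raw_fields = chunk.split("|")
        # The first 3 characters are the field name, the rest the data.
        fields = [(raw[:3], raw[3:]) for raw in raw_fields]
        segments.append((name, fields))
    return segments


def find_segments(segments, name):
    """Return every segment with exactly this name, in message order."""
    return [segment for segment in segments if segment[0] == name]
```

With a made-up message such as `"HDR|ver1.0||ITM|qty2|nmeapple||ITM|qty5|nmepear||"`, both `ITM` segments come back from `find_segments`, in the order they appeared. This only covers the data representation; wrapping it behind an object with methods is a separate matter.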

It's actually a very simple problem, but with a lot of room for error (and embellishment). Which is why we like it :). Yes, we could make it more handholdingly precise but (as you surmised) we prefer not to go too far that way because we don't want to have to hold people's hands on the job.

czth

[ Parent ]

That's fine (4.50 / 2) (#75)
by epepke on Wed Aug 13, 2003 at 12:31:02 PM EST

As to linking in libraries, it's an exercise and we won't be actually using it for our production system, so you're not restricted by licenses.

But it isn't specified anywhere in the problem description that it is so. This is my main problem. If it is to be a fair test, then it should so specify.

It's actually a very simple problem, but with a lot of room for error (and embellishment). Which is why we like it :). Yes, we could make it more handholdingly precise but (as you surmised) we prefer not to go too far that way because we don't want to have to hold people's hands on the job.

If you think that this kind of test correlates with needing to hold people's hands on the job, well, you're a very young man anyway. Most often, it's just the luck of the draw.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
miss what? (none / 0) (#92)
by Sacrifice on Wed Aug 13, 2003 at 06:58:01 PM EST

The remaining strings are fields; the first 3 characters are the field name, the rest are the data.
When you say there are fields, and that they have names, it is understood by default that the fields are uniquely named.

The possibility that the field names may be repeated (so that the field's values are ordered/multivalued) could possibly occur to the paranoid, but it wouldn't to me (except as a type of malformed input to detect).

[ Parent ]

Why do you understand that by default? (none / 0) (#93)
by czth on Wed Aug 13, 2003 at 07:46:11 PM EST

Consider XML. Elements do not need to be uniquely named. Similar idea.

You understand it by default. Be careful what you assume.

czth

[ Parent ]

What is a field? (5.00 / 2) (#152)
by Sacrifice on Thu Aug 14, 2003 at 04:58:48 PM EST

Why would I assume that a field has a unique name? Because, that's what the purpose of the name is: to identify the field, and when you say "field", you are referring to an object analogous to the field in a form. That is, the field occurs once. Don't be intentionally sly about not saying what you mean, and then act surprised when people don't read your mind. Forget about what you "meant" when you wrote it; it means only what the words evoke in the shared context. A reasonable person would explicitly mention that field names may reoccur, and that the associated values should be considered ordered, multiple values of the field name; that's the exception, not the rule, or at least make such a case the first example. You would probably do this if you weren't trying to catch people out.

[ Parent ]
Does Object Oriented have any place in a spec? (4.50 / 2) (#59)
by Lacero on Wed Aug 13, 2003 at 09:41:00 AM EST

Why specify the inner workings of the code? If it fulfills its purpose surely it doesn't matter if it's OO or not?

If you want OO for an interface, then you need to put the interface into the spec directly.

[ Parent ]

It came about honestly (4.00 / 2) (#70)
by czth on Wed Aug 13, 2003 at 12:08:05 PM EST

Apparently not all Perl programmers grok OO Perl, and we had some writing Perl 4 style even. So we did need to put it in. Plus see replies to other comments in this thread.

czth

[ Parent ]

Thats a fair point (4.00 / 1) (#90)
by Lacero on Wed Aug 13, 2003 at 03:23:56 PM EST

I was viewing the task as I would one given to me by my bosses, not as an employment check.

Thinking about it it's fair to ask candidates to demonstrate OO ability in an example program.

[ Parent ]

Some of us would argue... (none / 0) (#191)
by Merc on Fri Aug 15, 2003 at 12:27:36 PM EST

There is no such thing as OO perl.

OO Perl is like saying "dry water". It's an evil, ugly hack, that is a real pain to use. I started using Perl in '95 and it was great for a while, but although CPAN is great, the language itself hasn't aged well.

Unless you desperately need obscure CPAN modules, there are so many languages that are better.



[ Parent ]
Actually OO perl is great (none / 0) (#192)
by czth on Fri Aug 15, 2003 at 12:41:26 PM EST

It fits perfectly in with the language, and is as flexible as you need it to be. Just because it doesn't fit your personal repressed idea of OO (what's that? Java?) or isn't B&D enough for you doesn't mean it isn't OO, and I - and many other perl users and CPAN contributors - find it a great system in which to program.

czth

[ Parent ]

ooh, touchy aren't we? (none / 0) (#194)
by Merc on Fri Aug 15, 2003 at 01:03:46 PM EST

I'm not saying it's not a great system in which to program... well... that is what I believe, I just didn't say it earlier. I just think that Perl and OO are two words which don't belong in the same sentence.

In Perl, there's the whole junk of, creating $self as an anonymous hash, "bless $self", having to remove "$self" as a parameter to all "method" calls, and then there's just the ugliness of $self->{NAME}.

If you think Java is OO, then you haven't been exposed to enough OO programming. Try Ruby or better yet, Smalltalk.

Doing OO programming in Perl is like tacking a huge spoiler onto a Honda Civic. It's ugly and it doesn't work very well. Perl has its strengths, OO programming just isn't one of them.



[ Parent ]
Good points (none / 0) (#199)
by czth on Fri Aug 15, 2003 at 03:12:08 PM EST

And they will be fixed in Perl6 which will also bring peace to the middle east, cure world hunger, and make a darn good cup of coffee. But they still aren't enough to say Perl isn't OO - it does inheritance and polymorphism, although for the encapsulation aspect you sort of have to shut your eyes and pretend that you don't see what you aren't meant to see (but it sure helps to be able to violate that when debugging :>).

czth

[ Parent ]

Exactly (none / 0) (#211)
by epepke on Sat Aug 16, 2003 at 04:25:25 AM EST

There are

  1. Real O-O languages
    (Smalltalk, Modula, to some extent Objective C, to a smaller extent Java)
  2. Pseudo-O-O languages
    (C++, most of Java)
  3. Jeezie peezie, you're sure reaching and putting a lot of cruft in so that you can claim that this is O-O
    (Perl, Ada, CLOS)

The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
We're happy to answer questions (5.00 / 3) (#63)
by czth on Wed Aug 13, 2003 at 11:02:16 AM EST

(I replied to this earlier, either Scoop ate the reply or it got zero modded by some twit... or I forgot to hit 'Post'.... :).

Object-oriented shouldn't need to be said, but the spec evolved over time, and unfortunately it did need to be sent as sort of a hint to people. The main part of "OO" we want is encapsulation - their parser hands back an opaque token (pointer, reference, whatever) that can be used to walk the message or search for segments or fields by name.

"Maintainable" is probably a better word than extendable here. Again a hint: not just a one-off hack, try to show off the kind of code you'd write if you worked here; make it easy to make (minor) changes if we need to; divide code into easy to manage functions, etc.

Production-level is another hint: write comments, use whitespace, use meaningful names, etc. You're right in your title when you point out that it looks like a string of buzzwords, and we might change them (that's what a community site is for, right? To provide discussion and generate change where needed).

However in "production-level" there is no connotation that would require "hand-tuned assembly"; get it right first, then benchmark, and rewrite the hotspots only if the code will be used sufficiently to make it worthwhile. And in some (very rare - not here, anyway) cases that will include assembly.

If you were writing the code in C++ (we used to allow people to use their language of choice - if they want to be witty and use INTERCAL, they go in the round file - but have been asking for C lately), providing STL-type iterators for traversal would be quite acceptable, and allow leveraging of existing libraries (this might also come under "extensible"). The sort of traversal we want is a function to go to the start of the message, a function to advance to the next segment and return its name, and a function to advance to the next field within the current segment and return its name and value. Plus, as mentioned in the spec, functions to search forward for segment and fields by name.
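The traversal interface described above could look roughly like the following Python sketch. It is only one possible reading, not the expected answer: the class name `Message`, the method names, and the choice to return `None` at the end are all assumptions made for illustration, and valid input is assumed. Each instance carries its own cursor, so two parsed messages can be active at the same time.

```python
class Message:
    """Opaque handle over a parsed message; callers use only the methods."""

    def __init__(self, text):
        # Assumes valid input: every segment ends with "||".
        self._segments = []
        for chunk in text.split("||")[:-1]:
            name, *raw = chunk.split("|")
            self._segments.append((name, [(f[:3], f[3:]) for f in raw]))
        self.rewind()

    def rewind(self):
        """Go to the start of the message (before the first segment)."""
        self._seg = -1
        self._fld = -1

    def next_segment(self):
        """Advance to the next segment; return its name, or None at the end."""
        if self._seg + 1 >= len(self._segments):
            return None
        self._seg += 1
        self._fld = -1  # field cursor restarts in the new segment
        return self._segments[self._seg][0]

    def next_field(self):
        """Advance within the current segment; return (name, value) or None."""
        if self._seg < 0:
            return None
        fields = self._segments[self._seg][1]
        if self._fld + 1 >= len(fields):
            return None
        self._fld += 1
        return fields[self._fld]

    def find_segment(self, name):
        """Search forward for a segment by exact name; True if found."""
        found = self.next_segment()
        while found is not None and found != name:
            found = self.next_segment()
        return found is not None

    def find_field(self, name):
        """Search forward in the current segment; return the value or None."""
        field = self.next_field()
        while field is not None and field[0] != name:
            field = self.next_field()
        return None if field is None else field[1]
```

Because the searches only move forward from the cursor, calling `find_segment("ITM")` repeatedly steps through every segment named `ITM` in order, which is one way to cope with names that are not unique.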

Make no mistake, asking questions is a good thing (we're happy to see people ask questions), although there is enough information in the spec and it's as or more concise than most real world specifications. Of course there's a balance between asking questions and independence; I'd say if a wrong assumption means a major rewrite then ask up front, if not, write the code and state your assumptions.

You don't have to know what the parser is for exactly to write the code, but usage patterns might be handy. But we don't care about the difference between the code someone would write if they knew that (e.g.) we'd mostly skip to a given segment and iterate through all its fields, or if we'd always be doing searches by segment and field and hardly ever iterating. Clean and correct code is paramount in such an exercise as this and optimizations come way later (premature optimization sucks).

HTH,

czth

[ Parent ]

So... (2.50 / 2) (#155)
by synaesthesia on Thu Aug 14, 2003 at 05:54:10 PM EST

...if you can pick your language, and the language's standard iterators over encapsulated data are acceptable, would the following code be acceptable?

(I don't know Perl, but I assume one could write it similarly.)

<?php

// A message is an ASCII string of segments each ending with a
// double-pipe ("||"). A segment consists of one or more pipe ("|")
// separated strings; the first string is the segment's name.
// The remaining strings are fields; the first 3 characters are
// the field name, the rest are the data.  We assume that
// the input data is valid.

// Returns an array of arrays, keyed by segment name
// followed by field name.

function encapsulate($messagestring)
{
    // Start with an empty array so an empty message returns array(), not null.
    $result = array();
    $segments = explode("||", $messagestring);
    foreach ($segments as $segmentstring)
    {
        $fields = explode("|", $segmentstring);
        $segmentname = $fields[0];
        unset($fields[0]);
        foreach($fields as $field)
            $result[$segmentname][substr($field, 0, 3)] = substr($field, 3);
    }
    return $result;
}

?>

Sausages or cheese?
[ Parent ]

wouldn't be acceptable, sorry (none / 0) (#241)
by phred on Wed Aug 20, 2003 at 05:07:51 PM EST

any language or program that has "explode" makes the HR folks nervous.

[ Parent ]
Sorry, please play again (none / 0) (#242)
by czth on Fri Aug 22, 2003 at 04:52:46 PM EST

No. Any twit can return a nested structure (my @msg = map { s/^([^|]*)//; [$1, /\|(...)([^|]*)/g ] } split /\|\|/, $input; I didn't use a hash because names aren't necessarily unique). Return an object with a decent set of methods.
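For what "an object with a decent set of methods" might look like, here is a minimal sketch, written in Python rather than Perl purely for illustration. The message format is the one described in the PHP comment above; the class and method names (Message, Segment, Field, segments, first, get) are made up for this sketch and are not from the actual assignment. Segments and fields are kept in lists, not hashes, because names may repeat.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Field:
    name: str  # first three characters of the raw field
    data: str  # everything after the name


@dataclass
class Segment:
    name: str
    fields: List[Field] = field(default_factory=list)

    def get(self, field_name: str) -> List[str]:
        """Return the data of every field with this name, in order."""
        return [f.data for f in self.fields if f.name == field_name]


class Message:
    """Hypothetical wrapper; all names here are illustrative only."""

    def __init__(self, raw: str) -> None:
        self._segments: List[Segment] = []
        # Each segment ends with "||", so the final split piece is empty.
        for chunk in raw.split("||"):
            if not chunk:
                continue
            name, *raw_fields = chunk.split("|")
            self._segments.append(
                Segment(name, [Field(f[:3], f[3:]) for f in raw_fields])
            )

    def __iter__(self) -> Iterator[Segment]:
        """Iterate over segments in message order."""
        return iter(self._segments)

    def segments(self, name: str) -> List[Segment]:
        """All segments with the given name (names may repeat)."""
        return [s for s in self._segments if s.name == name]

    def first(self, segment_name: str, field_name: str) -> Optional[str]:
        """Data of the first matching field, or None if there is none."""
        for seg in self.segments(segment_name):
            values = seg.get(field_name)
            if values:
                return values[0]
        return None
```

Both usage patterns mentioned upthread are covered: iterate with `for seg in msg`, or look up directly with `msg.first("HDR", "DEF")`. Input is assumed valid, as in the original spec.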

Besides, not to burst your bubble but PHP sucks.

czth

[ Parent ]

Yay! I get to rant (4.50 / 4) (#60)
by RyoCokey on Wed Aug 13, 2003 at 09:56:08 AM EST

On a related note, I find that using different languages helps readability quite a bit. In particular, I find that Pascal-based languages like Delphi and TMT-Pascal are a lot easier to read than C-related ones (i.e., the ones that look like ASCII factory explosions).

Writing code that looks more like English helps reduce the number of comments needed. Plus, I type words and alphanumerics far faster than all those retarded brackets and pipes.



farmers don't break into our houses at night, steal our DVDs and piss on the floor. No
Reinventing the wheel--a true story (4.50 / 4) (#64)
by epepke on Wed Aug 13, 2003 at 11:08:18 AM EST

The comment about reinventing the wheel reminds me of a true story.

So, I'm at my previous job, internal development for a large hotel firm. A demand comes down the chain--we need a report for workers' comp insurance costs. It's something like a six-million-dollar prospect, and it needs to be done right now, or we lose big bucks. It's easy enough to do the query and takes only 30 minutes or so. But the insurance company will only accept it in a spreadsheet with a certain format, with various highlighting. It's too big to do by hand.

My buddy Ram, one of the few people I am willing to call a colleague, finds some old code using a Perl module to put together spreadsheets. Trouble is that it doesn't work with the new Perl and the new Excel. He's a very bright guy, but he spends two days trying to get it to work.

I decide that something needs to be done. From my old fart memories of the distant past, I remember there used to be a thing called SYLK. I do a search and find a guide on a Russian (!) website. A couple of hours later, and the problem is solved.

Did I reinvent the wheel? Do I feel justified in having done that? Hell, yes, to both.

Being a good employee, I write up how to write SYLK files in the local (pretentiously named) Book of Knowledge.
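For the curious, a minimal sketch of what writing a SYLK file involves, in Python for illustration. It relies only on the basic record types as commonly documented: an ID header record, C records carrying a row (Y), column (X) and value (K), and a closing E record. The function name to_sylk is hypothetical, no escaping is attempted, and the highlighting the insurer wanted is not covered.

```python
def to_sylk(rows):
    """Render a list of rows (lists of str/int/float) as minimal SYLK text."""
    lines = ["ID;P"]  # header record; a program name may follow the P
    for y, row in enumerate(rows, start=1):
        for x, value in enumerate(row, start=1):
            if isinstance(value, (int, float)):
                cell = str(value)  # numbers are written bare
            else:
                text = str(value)
                # Assumption: no escaping is attempted, so reject awkward text.
                if any(ch in text for ch in '";\n'):
                    raise ValueError("unsupported character in %r" % text)
                cell = '"%s"' % text  # text values are double-quoted
            lines.append("C;Y%d;X%d;K%s" % (y, x, cell))
    lines.append("E")  # end-of-file record
    return "\n".join(lines) + "\n"
```

Saving the returned text with a .slk extension is the usual way to hand it to a spreadsheet program; whether a given Excel version accepts it unchanged is something to check, not something this sketch guarantees.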

Fast-forward a year. I have decided to leave the locality. I am interviewing and attempting to train my replacement. He reads the section on SYLK and comes to me, a superior gleam in his eye, and asks if we had tried such-and-such Perl modules, hoping to put one over. I tell him the story, and it seems to shut him up. Nevertheless, to the day I leave, I hadn't seen him do much actually productive. Sure, there's a ramp-up time, but when I was there, within a week of starting I had refactored the server code.

And, maybe he's right. And maybe czth is right. Because my replacement (probably) has an income, but I'm making scraps off of what consulting gigs I can find.


The truth may be out there, but lies are inside your head.--Terry Pratchett


I don't consider that reinvention (4.66 / 3) (#68)
by czth on Wed Aug 13, 2003 at 11:58:44 AM EST

Did I reinvent the wheel? Do I feel justified in having done that? Hell, yes, to both.

Not really - what wheels were out there didn't work for you so you took what you could find (didn't reinvent it, as you describe it) and used it; actually you went one better and used some very specific knowledge you had to find an existing solution. Now, if you'd decided to needlessly reverse engineer the Excel format from scratch, that would merit condemnation.

BTW, we use the Spreadsheet::WriteExcel perl module here - I have a patch in it :) - to generate some of our reports, but I'm not the one directly using it so I can't attest to how good it is (but the reports look OK), and maybe it wasn't around when you needed it. I looked for SYLK (interesting first hit on Google BTW) but didn't find anything more than "symbolic link"....

What I take issue with is the really bad cases like someone inventing their own (Yet Another) XML parser, for example, or reimplementing something first written in perl in C for speed without having run benchmarks. Or simple things like people writing look_for_character_in_string() when strchr() exists and does the exact same thing and is quite probably written in optimized assembler for the target platform.
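The same point in a language-neutral form, sketched in Python: the hand-rolled function below is a hypothetical stand-in for look_for_character_in_string(), and str.find is the built-in that already does the job. (C's strchr() returns a pointer rather than an index, so this is an analogy, not a translation.)

```python
def look_for_character_in_string(text, ch):
    """Hand-rolled search: the kind of reinvention being complained about."""
    for index, current in enumerate(text):
        if current == ch:
            return index
    return -1  # not found, mirroring str.find's convention


def find_with_builtin(text, ch):
    """The built-in equivalent: one call, already tested and optimized."""
    return text.find(ch)
```

Both return the index of the first occurrence or -1, so the hand-written version buys nothing except more code to maintain.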

czth

[ Parent ]

Well... (4.00 / 1) (#72)
by epepke on Wed Aug 13, 2003 at 12:20:46 PM EST

Not really - what wheels were out there didn't work for you so you took what you could find (didn't reinvent it, as you describe it) and used it; actually you went one better and used some very specific knowledge you had to find an existing solution.

I wrote all the code myself. And, whether it would be re-inventing the wheel in a professional context is quite a different matter than what is considered re-inventing the wheel in the test context. That's the problem.

What I take issue with is the really bad cases like someone inventing their own (Yet Another) XML parser, for example, or reimplementing something first written in perl in C for speed without having run benchmarks.

The problem is that a lot of XML parsers really suck. There really isn't much decent out there that doesn't require reading the entire stream into memory. I'd like something like a language that did for XML what Perl did for ASCII files. I don't think that the realm of invention has been exhausted here. Although I don't think it requires a new language, it may require some iterators.

Rewriting perl in C for speed seems strange. There are Perl jobs, and there are C jobs (and C++ jobs and so on). Sometimes there's overlap, but there are a lot of tradeoffs.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
Reinvention, perl and C (3.00 / 1) (#77)
by czth on Wed Aug 13, 2003 at 01:41:07 PM EST

Well, reinvention to me means writing from scratch when something that does (near enough - and that's of course problem-dependent; if it does what you want but is too slow then it's not near enough) what you need already exists and is available (and "not available" could just mean it's not under a license that you can use).

Rewriting perl in C for speed seems strange. There are Perl jobs, and there are C jobs (and C++ jobs and so on). Sometimes there's overlap, but there are a lot of tradeoffs.

It's actually the best of both worlds, and a lot of core and CPAN perl modules are written in C. This gives the benefit of C's speed where you need it but the flexibility/development speed/memory management/safety/etc. advantages of perl the rest of the time. Of our own modules only about 3 or 4 have C backends; those are for places we've profiled and determined that the increased complexity would be worthwhile (and it's hidden from users anyway).

czth

[ Parent ]

Hmmm??? (none / 0) (#104)
by epepke on Thu Aug 14, 2003 at 01:45:59 AM EST

It's actually the best of both worlds, and a lot of core and CPAN perl modules are written in C.

That's true, but that fact is (and should be) of no importance to me when I'm using a library or module. All I should need to know is that it works according to how the external interface was designed.

My point about rewriting is that it doesn't make sense to me to rewrite an entire Perl program in C based on a belief that it somehow makes it faster. Perl's features, including regular expressions, hashes, and references (which allow hashes of references to hashes of references to arrays of references to, well, you get the point) often lead to overall structures of code that look very different in Perl with respect to C.

Case in point, another job I once had. I had a list of some few hundred contracts, each of which had the terms of the contract written as comments. I had to convert all of these terms to machine-understandable forms. So, it was a small natural language problem. To me, this had "Perl job" written all over it. I applied cleaning up and some generative grammar and transformational grammar to the strings, tweaking the code until all the strings could be understood by the machine. First cut got all but 100, next got all but 50, then all but 12, and so on. This seems a quintessential Perl hack to me.
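The tweak-and-rerun loop described above can be sketched roughly as follows, in Python for illustration. The two rules and the sample phrasing are invented for this sketch; the real contract terms and grammar are not known. The idea is only the workflow: clean up each string, try each rule, and collect whatever is still not understood so the rules can be extended on the next pass.

```python
import re

# Hypothetical rules: (pattern, function building a machine-readable term).
RULES = [
    (re.compile(r"net (\d+) days", re.I),
     lambda m: {"payment_days": int(m.group(1))}),
    (re.compile(r"(\d+(?:\.\d+)?)% discount", re.I),
     lambda m: {"discount_pct": float(m.group(1))}),
]


def normalize(text):
    """Cleanup pass: collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def parse_terms(comments):
    """Return (parsed terms, comments no rule understood yet)."""
    parsed, unmatched = [], []
    for raw in comments:
        text = normalize(raw)
        for pattern, build in RULES:
            match = pattern.search(text)
            if match:
                parsed.append(build(match))
                break
        else:
            # No rule fired: keep the original for the next round of tweaks.
            unmatched.append(raw)
    return parsed, unmatched
```

Each run shrinks the unmatched list ("all but 100", "all but 50", ...) as rules are added.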

There is just no way that I would use the same approach if I somehow were required to write this in C or LISP or some other language. Also, rewriting such an algorithm in C just because of the conceit that "C is faster" would make no sense, because it would be bound by the regular expression package, which is likely as optimized in Perl as anywhere.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
Rewriting (none / 0) (#117)
by czth on Thu Aug 14, 2003 at 09:46:21 AM EST

My point about rewriting is that it doesn't make sense to me to rewrite an entire Perl program in C based on a belief that it somehow makes it faster.

Of course not. We profile and then rewrite the functions/methods/modules that seem to need it, just dropping to C where necessary. And in most cases we're talking saving hours of time daily, not just seconds or minutes.
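A minimal sketch of the "profile first" step, using Python's standard cProfile and pstats modules for illustration (a Perl shop would reach for a Perl profiler instead). The workload functions are hypothetical; the point is that the report, not a hunch, picks the candidate for a C rewrite.

```python
import cProfile
import io
import pstats


def slow_checksum(data):
    """Deliberately naive hot spot (hypothetical example workload)."""
    total = 0
    for value in data:
        for _ in range(50):
            total = (total + value) % 65521
    return total


def load_data():
    """Cheap setup step that should not be worth optimizing."""
    return list(range(2000))


def run_report():
    data = load_data()
    return slow_checksum(data)


def profile(func):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and show only the top few entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()
```

Calling `profile(run_report)` and reading the report should show slow_checksum dominating the run time, which makes it the function worth rewriting; load_data is left alone.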

czth

[ Parent ]

Exactly (none / 0) (#120)
by epepke on Thu Aug 14, 2003 at 10:13:25 AM EST

Of course not. We profile and then rewrite the functions/methods/modules that seem to need it, just dropping to C where necessary.

There is no point in optimizing the 90% of the code that only takes 10% of the resources.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
Why not Lisp? (none / 0) (#159)
by donio on Thu Aug 14, 2003 at 06:59:09 PM EST

I understand why C wouldn't be ideal for this but why not Lisp? To me
this looks like the sort of problem Lisp would be great for. (By
"Lisp" I mean something modern, Common Lisp or possibly Scheme)

[ Parent ]
I didn't say it wouldn't be good (none / 0) (#178)
by epepke on Fri Aug 15, 2003 at 02:26:26 AM EST

I said that the approach that I would take in LISP would be significantly different from the approach I took in Perl.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
I think I am missing the point. (none / 0) (#74)
by Tezcatlipoca on Wed Aug 13, 2003 at 12:28:09 PM EST

But I am sure that is down to my whole incapacity as a programmer rather than due to a contrived writing style that wishes not to make a point.

Might is right
Freedom? Which freedom?
Sometimes you just want to rant :> (none / 0) (#79)
by czth on Wed Aug 13, 2003 at 01:49:44 PM EST

But in retrospect, if this isn't voted up I might rewrite it with the kinder, gentler point of view of "common mistakes beginner programmers make and how to avoid them", and make it more of a teaching article. That would also satisfy some other commenters who either think I'm too full of myself or aren't happy with me declaring the whole world inept. It's hard to keep up a good ranty demeanor this long!

czth

[ Parent ]

Certainly rings a bell! (none / 0) (#80)
by rujith on Wed Aug 13, 2003 at 01:51:16 PM EST

There's LOTS of code by "programmers" that I've been forced to rewrite, and the result is almost invariably shorter, clearer, and more efficient. Most programmers don't deserve that designation, and shouldn't be allowed near a computer.

By the way, I've never had a TV either, like czth. My wife brought hers into the house, but she and I probably watch at most one hour of TV per month.

- Rujith.

Minor point re: TV (4.00 / 1) (#83)
by czth on Wed Aug 13, 2003 at 02:12:10 PM EST

I didn't say never had one, just not when growing up ("formative years"); when I was 9 we moved to Canada (from the UK) and got one I think a year or so? after, but it was too late, I'd already learned to read and had a decent imagination...*. I also have a TV now but mainly watch DVDs and a few select programs.

* I was reminded of an ad by Infocom regarding their text adventures, going up against the (then poor) graphical games:

You'll never see Infocom's graphics on any computer screen. Because there's never been a computer built by man that could handle the images we produce. And, there never will be. We draw our graphics from the limitless imagery of your imagination - a technology so powerful, it makes any picture that's ever come out of a screen look like graffiti by comparison. And nobody knows how to unleash your imagination like Infocom.

(Source; some also say that a narrow-minded devotion to text adventures is why Infocom failed, though.)

czth

[ Parent ]

My experiences (5.00 / 5) (#84)
by lauraw on Wed Aug 13, 2003 at 02:21:41 PM EST

A few random comments....

Don't worry too much about degrees (and most certificates are absolute crap);

Definitely. When I was a manager at IBM a few years ago, there were about a dozen people in my group. Among the degrees were a philosophy PhD, a BA in percussion performance, a BS in geology (me), and an AA in progress at a local community college. Skill and talent didn't seem to have much, if any, correlation with degrees. One year I wanted to give the guy who was still in community college a big raise because he was way underpaid for his skill level and productivity. I expected the higher-ups at IBM -- HR, the 3rd-level manager, and so on -- would object because he didn't have a degree yet. They didn't; the attitude was that pay should be tied to performance, not degrees. (The company had always said that was the attitude, but it was nice to see them behave that way.)

getting your applicants to write a simple program.

This is a very good idea. It doesn't necessarily have to be a large assignment that they do on their own time and submit, either. Just having people write out code for a specific task on the whiteboard or a piece of paper during the interview can tell you a lot about how they program. Back when I worked at Taligent, a serious C++ pioneer, another of the interview techniques was to show someone a one-page function and ask, "What's wrong with this code?" It was interesting to see which problems they noticed and which they didn't, because the sample had many problems ranging from syntax errors to misleading comments to bad (or missing) OO design.

An advantage to having them write the code in person is that you get to see them think. I interviewed at Google last month, and several of the interviewers asked me to write solutions for small but non-trivial problems. I made sure to think out loud while I was doing it: "Well, first I need to have an array to store the output in. And I can compute its size by ...." That way the interviewers could see that I understood the problem and thought through the issues rather than doing it by brute force or random luck. Even when I don't know the answer, I try to work through the problem out loud or on the whiteboard so that they at least know that I'm able to think and brainstorm.

Though I haven't used it myself, I think your technique of giving people an outside programming assignment is a good one too. Even though mediocre programmers will be able to complete the assignment by using reference materials, tutorials, or whatever, they're just not going to see as much as the really good programmers will. They'll tend to leave some issues unaddressed because the issues don't occur to them, not put in decent documentation because they don't really see that it's needed, and so on. I think you'd be able to tell the difference pretty easily.

Laura

Something is wrong with the world on a day when Kuroshin is faster than Slashdot

Not too surprising (none / 0) (#216)
by Merc on Sat Aug 16, 2003 at 03:11:10 PM EST

[...] a BS in geology (me) [...]

Not too surprising that someone who doesn't have "the appropriate degree" for a job would consider that having the appropriate degree for a job doesn't matter much. :)

For what it's worth, I'm working as a programmer/engineer type at a small company where I, a B.Sc. in Engineering Physics, work alongside guys with PhDs and Masters degrees in "the appropriate fields".

I think one thing I lack from not having that degree is some of the terminology. I often hear them using various terms that I have heard before but I don't understand the subtle meanings.

On the other hand, I can sometimes use different analogies, or approach problems from different directions. I suppose it's a tradeoff.



[ Parent ]
Re: Not too surprising (5.00 / 1) (#234)
by lauraw on Mon Aug 18, 2003 at 05:37:34 PM EST

Not too surprising that someone who doesn't have "the appropriate degree" for a job would consider that having the appropriate degree for a job doesn't matter much. :)

Good point. :-) I inherited the group from someone else, though, so my biases didn't influence its makeup too much. Of the folks I hired, one was a fresh-out-of-college CS major, one almost had an MSCS, and I can't remember about the other two. Oh, and the guy with an AA converted from part-time to full-time.

When I was trying to hire people, I was sometimes amazed at how little a BSCS seemed to be worth. One time I went down to a job fair at Cal Poly (which has a very good reputation) and talked to lots of CS students. (Everyone who wanted a Java job talked to me, because I was the only one hiring Java engineers.) The majority of them didn't have a clue about writing non-trivial software, about OOD, or about any of the semi-advanced Java idioms. Yet they seemed to be getting decent grades and surviving school OK. On the other hand, there were a couple of really good people who I ended up hiring, so it was a worthwhile trip.

I don't think that's specific to CS degrees, though; it probably applies to most fields.

One thing that helped me "catch up" with CS folks is a lot of reading. Back when I was getting into programming, and occasionally even now, I'll read books on algorithms, OOD, new technologies, or whatever. Before the Google interview I embarked on another reading binge, starting with an algorithms book. (That was a good thing; their interviews are very heavy on algorithm design and performance characterization.) Now I've worked my way through my book collection to the obscure stuff like the Pattern Languages of Program Design series, buying about 10 new books in the process.

I agree that having a different perspective can help. A science background seems to help too, because it tends to teach people critical thinking. One place I interviewed in Chicago after college was doing modelling software for options trading, and most of their programmers had degrees in physics because of all the bizarre time series and partial differential equation stuff they were doing. I didn't end up getting that job, which in hindsight is probably a good thing because I think all that math would have made me crazy.

Laura

[ Parent ]

Puritans are a pain in the ass (4.40 / 5) (#85)
by the on Wed Aug 13, 2003 at 02:26:34 PM EST

Whether they're telling you how evil you are for skipping church or they're telling you what to write in your comments.

BTW, my favorite comment was this:

/*
* This code was extremely hard to write so don't expect
* me to explain how it works in a bunch of comments
* in this code.
*/

The author: yours truly.

--
The Definite Article

My father tells a story (5.00 / 6) (#111)
by Karmakaze on Thu Aug 14, 2003 at 09:01:53 AM EST

My father tells a story of dissecting some mainframe code.  It was meticulously laid out, and documented in clear, loving detail.  It was a lovely bit of code, and wonderfully commented.

Except that all of the comments were in an obscure dialect of Hindi nobody there could read or understand.  The original programmer was long gone.  Ooops.
--
Karmakaze
[ Parent ]

Now that's job security!! {NT} (none / 0) (#133)
by clover_kicker on Thu Aug 14, 2003 at 12:51:26 PM EST


--
I am the very model of a K5 personality.
I intersperse obscenity with tedious banality.

[ Parent ]
I expect that's commonplace nowadays (nt) (none / 0) (#153)
by the on Thu Aug 14, 2003 at 05:40:58 PM EST



--
The Definite Article
[ Parent ]
Would you like some cheese with that whine? (3.69 / 13) (#88)
by Fon2d2 on Wed Aug 13, 2003 at 03:02:53 PM EST

First of all, I find myself somewhat miffed as to the point of this article. Does it exist just to poke fun at those ill experienced with programming, lament about the lack of professionalism amongst programmers, or educate programmers on proper coding style? The article attempts to do all three and consequently it lacks direction. The article is also written with an attitude not suitable for a formal employer. If it were my code being berated in this article, I would not be happy. I find it difficult to believe that all the derision of this article is based on such a shortly and vaguely worded assignment. Judging from other comments, I am not the only one who had to read it multiple times to grasp what it was asking. If you want to write a worthwhile article, then make it an educational one. The college system does suck, and no amount of experience is ever going to fill in for that gap. I had never even heard of the ten commandments, and there were things in there I was completely unaware of: the "lint" program, proper brace style, the fact that "NULL" should always be typecast, the fact that () does not equate to (void), etc. I would hate to have an application turned down because of a lack of knowledge that could be bridged by one friendly and informative statement. That's why not all employers are so stringent. They want to get the best potential, not the widest knowledge of trivial language details.

Actually, czth meant to write about paragraphs. (4.50 / 2) (#184)
by haflinger on Fri Aug 15, 2003 at 08:17:43 AM EST

You know, separating your k5 posts by popping in a <p> every once in a while. But then, he forgot about k5, and was at work one day, and thought "I was getting really riled up about something writing- and computer-related… hmmm, must have been these crazy coders!"

However, you really need to learn about paragraphs. They're nice.

Did people from the future send George Carlin back in time to save rusty and K5? - leviramsey
[ Parent ]

Oh that we could be that choosy... (none / 0) (#89)
by meaningless pseudonym on Wed Aug 13, 2003 at 03:23:27 PM EST

Last time I was involved with this sort of thing, there only ended up being one candidate who we liked sufficiently to even bother testing them. He then provided a dead-end solution that we had to debug to test.

He was hired.

All of these woes could have been eliminated... (3.41 / 12) (#91)
by egg troll on Wed Aug 13, 2003 at 05:55:37 PM EST

...had you chosen to use Python instead. Better luck next time!

He's a bondage fan, a gastronome, a sensualist
Unparalleled for sinister lasciviousness.

pfft (2.25 / 3) (#94)
by reklaw on Wed Aug 13, 2003 at 07:47:49 PM EST

This nonsense is why I use VB6.
-
Comments (4.00 / 2) (#95)
by The Solitaire on Wed Aug 13, 2003 at 08:59:23 PM EST

I wonder if the overzealous commenting might come from the fact that the code they are producing is part of a "test". Many programmers know that it is important to comment, but might not be completely clear on what constitutes the right amount of commenting in a production context. In the absence of that information, they likely err on the side of caution, and overcomment their work. Sure, they could ask, but this might violate the concept of a test. In other words, they might think that asking a stupid question will make them look like a clueless noob.

I expect that this kind of problem would be especially true for programmers that are fresh out of university, for two reasons. First, the kind of mindless programming tasks in university have little to do with the real world. Also, the university environment might make people unwilling to violate the concept of a "test", by asking for more information.

Second, specifically with respect to comments, as a university CS student, you're constantly nailed over the head with "comments are important!!", but never really get good instruction on when comments are useful, and when you're just going overboard. Having been a TA/Marker for CS courses many times in the past, I could give you hordes of examples of truly bizarre comment behaviour, that often never gets corrected (it does when I come across it, but I have a feeling other markers ignore it).

Just to give you the flavour of what I mean, I've seen code that produces comments like

int i = 1; // set integer i to 1

but entirely fails to comment really important algorithmic details which are really hairy to follow just by examining the code.
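To make the contrast concrete, here is a small sketch in Python (the function is just a convenient example, not taken from any student submission). The first comment restates the code and adds nothing; the second records the loop invariant, which is the part a maintainer cannot easily reconstruct by staring at the code.

```python
def find_insert_position(sorted_values, target):
    """Return the index at which target keeps sorted_values in order."""
    low = 0  # set low to 0   <- noise: says what the code already says
    high = len(sorted_values)
    # Useful: states the invariant the loop maintains.
    # Everything before `low` is < target and everything from `high`
    # onward is >= target; the loop shrinks the gap until low == high.
    while low < high:
        mid = (low + high) // 2
        if sorted_values[mid] < target:
            low = mid + 1
        else:
            high = mid
    return low
```

Deleting the first comment loses nothing; deleting the second costs the next reader a few minutes of re-deriving why the loop is correct.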

In conclusion, if you didn't in the original assignment, you might want to make sure to explicitly tell applicants that they will not be penalized for asking reasonable questions.

I need a new sig.

Overcommenting and reasonable enquiries (none / 0) (#96)
by czth on Wed Aug 13, 2003 at 09:12:23 PM EST

I wonder if the overzealous commenting might come from the fact that the code they are producing is part of a "test".

Very few of my comment complaints (in fact, I don't think any) were about overcommenting. Just using overly fancy styles. There was one couple (yep) that was downright garrulous in their commenting - that was the 85% comments one - but most were too little or just ugly style. You might get an idea of what I mean by "garrulous" if you looked at a site which, while I wouldn't want to divulge any names or anything, might be relevant (and scary).

I've been a TA in the past too, and have seen some of the bizarreness. I think as a programmer develops (har har no pun intended) they come up with their own commenting and coding style, and as long as it's defensible, that is, they can justify the important elements of it - even if just to themselves - it's probably OK. It's the clone-and-hackers I worry about.

In conclusion, if you didn't in the original assignment, you might want to make sure to explicitly tell applicants that they will not be penalized for asking reasonable questions.

Nah, they're big boys and girls and should be able to figure that out :>. Most of our applicants have several years experience - I don't think we've considered anyone fresh out of school (perhaps we should...).

czth

[ Parent ]

Oh dear lord... (none / 0) (#99)
by The Solitaire on Wed Aug 13, 2003 at 09:36:42 PM EST

That website is positively disturbing. Do they really think that design like that is gonna help find them meaningful employment?? I am desperately hoping that this site is a joke.

I need a new sig.
[ Parent ]

That's what we thought (none / 0) (#100)
by czth on Wed Aug 13, 2003 at 09:52:53 PM EST

Apparently it's for real, though. (Shudder.)

And seemingly from their info on there they do get work... so maybe if you're old, stable (read: married and lived somewhere for a while), have a company (S-corp., sole proprietorship, whatever) in your name, and can buzz the buzzwords, you're in....

czth

[ Parent ]

techheads (4.00 / 1) (#110)
by cypherpunks on Thu Aug 14, 2003 at 08:14:51 AM EST

This page has been visited 711 times since March 14, 2000
(57 times today, 72 times this week, and 76 times this month).

popular website, too ... I hope k5 does something for their popularity. I wonder if I could do anything to help.

telephone goats larrikin fnord cajoling.

[ Parent ]

Good hiring practice. (4.00 / 1) (#101)
by ghjm on Wed Aug 13, 2003 at 10:15:03 PM EST

It's relatively easy to set a simple test like this for a programmer - but what are you going to do for, e.g., a sysadmin?

-Graham

I don't hire sysadmins (none / 0) (#102)
by czth on Wed Aug 13, 2003 at 10:41:29 PM EST

(To be honest I don't hire anybody, I just make recommendations and do technical screening and interviews if people get that far. Of course at the interviews I always bring a pencil and sharpen it with a penknife while I'm waiting for answers... ;)

I'd come up with something if necessary, but it would be harder to ask for something remote (but we could set up a system and say "make it secure" or something like that, narrowed down a little perhaps!).

czth

[ Parent ]

from a previous comment (4.00 / 1) (#103)
by clover_kicker on Wed Aug 13, 2003 at 10:53:53 PM EST

Cribbed from here.
At one job interview I was handed a marker, sent to the whiteboard, and told to "Draw and explain to me the last network you worked on, in as much detail as possible. We'll tell you when to stop."

In a similar vein, I heard about an interview question "Please explain exactly what happens when you type 'ping kuro5hin.org' in as much detail as you can. We'll tell you when to stop."


--
I am the very model of a K5 personality.
I intersperse obscenity with tedious banality.

[ Parent ]
Simple test (none / 0) (#122)
by CaptainSuperBoy on Thu Aug 14, 2003 at 10:37:14 AM EST

If you want a test that will evaluate a sysadmin, just bring a small animal to the interview. If the prospective sysadmin tries to torture the animal, that's your guy.

--
jimmysquid.com - I take pictures.
[ Parent ]
Sysadmin skills. (none / 0) (#185)
by haflinger on Fri Aug 15, 2003 at 08:59:07 AM EST

Fundamentally, sysadmins are a strange, modified breed of programmer. (I've done sysadmin work. I greatly prefer it to ordinary programming.)

We don't create original code, as a rule, which is why you're asking your question. (In some environments, we do, but we shouldn't.)

If code alteration is common in the environment (and it often is, sysadmins do a lot of customization) then a decent way to test I think would be to throw some Perl (or other common language) at the admin and say "change this to do X."

However, the most important things in sysadmin work are working with people, making them happy, and crisis management. So you're looking for people who never, ever panic, or get angry because of unreasonable demands. Icewater in the veins. So interviewing us is more like a normal interview: you're trying to get at personality traits, not so much the talent.

Did people from the future send George Carlin back in time to save rusty and K5? - leviramsey
[ Parent ]

Anger is useful as a sysadmin [nt] (none / 0) (#243)
by esrever on Mon Mar 08, 2004 at 03:47:17 AM EST



Audit NTFS permissions on Windows
[ Parent ]
sysadmins write scripts (none / 0) (#209)
by bolthole on Sat Aug 16, 2003 at 02:58:56 AM EST

or at least, good ones do.

So get em to write a short shellscript.

At MINIMUM, they should be able to do simple one-liners, or the old "how do you find all files in a directory tree that have 'xxxx' in them" sort of thing.
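For reference, the "find all files containing 'xxxx'" exercise can be sketched like this in Python, using only the standard os module (a shell answer would be a one-liner; this is the longhand version). The function name files_containing is hypothetical, and each file is read whole into memory, which is an assumption that the files are reasonably small.

```python
import os


def files_containing(root, needle):
    """Yield paths under root whose contents include needle (bytes)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        yield path
            except OSError:
                # Unreadable file (permissions, vanished, etc.): skip it.
                continue
```

Usage would be something like `for path in files_containing(".", b"xxxx"): print(path)`.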


[ Parent ]

Ha ! (none / 0) (#105)
by bugmaster on Thu Aug 14, 2003 at 03:28:11 AM EST

You think I have it bad. I recently had to add some code to my old, old program, which was maintained by a few other people since I first wrote it. This is what I saw:
doFoo('A');
doBar('A');
if(baz) {
    // a complicated formula involving 'A'
}
doFoo('B');
doBar('B');
if(baz) {
    // a complicated formula involving 'B'
}
doFoo('C');
doBar('C');
if(baz) {
    // a complicated formula involving 'C'
}
// ... 23 more copy-pasted blocks like that
You think I'm kidding ? I'm not. And then you complain about compressed struct allocations...
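What those 26 blocks presumably wanted to be is a loop. A minimal sketch in Python, with do_foo, do_bar and complicated_formula as hypothetical stand-ins for the real routines (the original was not Python, and the real formula is unknown):

```python
import string

calls = []  # records what was called, for demonstration only


def do_foo(letter):
    calls.append(("foo", letter))


def do_bar(letter):
    calls.append(("bar", letter))


def complicated_formula(letter):
    """Placeholder for the real per-letter formula."""
    return ord(letter) - ord("A")


def process_all(baz):
    """One loop over 'A'..'Z' instead of 26 copy-pasted blocks."""
    results = {}
    for letter in string.ascii_uppercase:
        do_foo(letter)
        do_bar(letter)
        if baz:
            results[letter] = complicated_formula(letter)
    return results
```

A fix to the formula now happens in one place instead of 26.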

Anyways, sorry, just had to vent.


>|<*:=

Oops (none / 0) (#106)
by bugmaster on Thu Aug 14, 2003 at 03:29:29 AM EST

First line should say,
You think you have it bad.
Sorry about that. Freudian slip, or something.
>|<*:=
[ Parent ]
Worst part is... (1.00 / 1) (#113)
by porkchop_d_clown on Thu Aug 14, 2003 at 09:11:22 AM EST

I actually didn't notice - I just "read what you meant". When I read your correction, I had to go back and figure out what was wrong with the original...


--
His men will follow him anywhere, but only out of morbid curiosity.


[ Parent ]
Fun with variable names and comments (none / 0) (#107)
by vnsnes on Thu Aug 14, 2003 at 06:15:24 AM EST

I worked with a guy whose 5-year-old code I had to adapt to another database backend and who had a propensity for not including any comments and naming his variables 't', 'tt', 'ttt', 'x', 'xx' and so on. These variables were meant to be temporary, but would invariably remain in the code. Because I was also working with him on new code that we were writing, I kept pestering him to include comments in his code so that 5 years down the road some poor shmuck would have an easier time figuring out what his strange variables mean. So one day I see a comment in a piece of code he wrote: "This is a comment for vnsnes" Expletives rained throughout the office that moment. Another time he asked me to take a look at his code because it just wasn't working and he had been staring at it too long. Well it turns out that he was using one of the 'tt' variables down in the problem section. But he'd already used 'tt' as a name in the upper part of the code. Solution? Append a third 't' to the variable name. Eureka!

pardon the lack of formatting. here's it is.... (none / 0) (#108)
by vnsnes on Thu Aug 14, 2003 at 06:17:53 AM EST

I worked with a guy whose 5-year-old code I had to adapt to another database backend and who had a propensity for not including any comments and naming his variables 't', 'tt', 'ttt', 'x', 'xx' and so on. These variables were meant to be temporary, but would invariably remain in the code. Because I was also working with him on new code that we were writing, I kept pestering him to include comments in his code so that 5 years down the road some poor shmuck would have an easier time to figure out what his strange variables mean.

So one day I see a comment in a piece of code he wrote:

"This is a comment for vnsnes"

Expletives rained throughout the office that moment.

Another time he asked me to take a look at his code because it just wasn't working and he was staring at it too long now. Well it turns out that he was using one of the 'tt' variables down in the problem section. But he'd already used 'tt' as a name in the upper part of the code. Solution? Append a third 't' to the variable name. Eureka!

[ Parent ]

Don't throw stones in glass houses (3.33 / 3) (#109)
by htmltidy on Thu Aug 14, 2003 at 07:32:10 AM EST

The writer lamblasts all these code examples, which were written under a high pressure situation for a job interview, while showing his own inadequacies with simple html markup. The large block comment section beginning the article is formatted (poorly, the right edge markers are in disarray) with non-breaking spaces, resulting in a page-widening experience for anyone with a default font size much larger than "unreadable". Removal of the non-breaking spaces, SHORTENING of the long lines of comment characters (with no loss of content), and wrapping the segment in pre /pre tags results in a nicely formatted comment block, and an article that is readable without horizontal scrolling. Do it right the next time before you start throwing stones at the skills of others.

All the code looks good in my browser... (none / 0) (#112)
by porkchop_d_clown on Thu Aug 14, 2003 at 09:09:24 AM EST

What kind of weirdo fonts you usin'?


--
His men will follow him anywhere, but only out of morbid curiosity.


[ Parent ]
Font setting (none / 0) (#158)
by htmltidy on Thu Aug 14, 2003 at 06:54:09 PM EST

Default font (provided a page does not request a different font) is b&h-lucidabright, with a minimum font size of 16 to cover for those fools that use Microsoft html generators that like to include 'font size="1"' tags ('' substituted for angle brackets) all over the place, resulting in a font so small it's unreadable.

[ Parent ]
Misconceptions (none / 0) (#115)
by czth on Thu Aug 14, 2003 at 09:37:59 AM EST

  1. Not that high pressure - they didn't have to write the code on site, and had pretty much all the time they wanted and could write from the comfort of their easy chair if they felt like it.

  2. The formatting, were you to have read the editorial comments, was worked at a fair bit. To start I tried non-breaking spaces but I had spell check on and for some reason it pointed them out as spelling errors, so I figured nbsps weren't allowed and started using dots (.). Then someone else told me to try nbsps and assured me they worked, and pointed me to this article. So I figured it was just the spell check and went back and regenerated my HTML to use nbsps instead of dots and all was well. I also don't think that pre is an allowed tag for stories. Sorry that it doesn't fit in your browser.
czth

[ Parent ]
How to do it (none / 0) (#119)
by epepke on Thu Aug 14, 2003 at 10:10:28 AM EST

I went through this fun when doing the diagrams for the Special Relativity article. Here's how to do it so that it works in a wide variety of browsers:

  1. Use <tt> to enclose multiline blocks of code. Do not use <code> except for single words and short phrases within blocks.
  2. Use   alternated with regular spaces for indentation of text. Long strings like      can cause problems with the word wrapper.
  3. Whatever you do, don't trust the text in the boxes when you move the article to voting. Always paste from your original source of HTML.

The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
Agreed (none / 0) (#123)
by czth on Thu Aug 14, 2003 at 10:39:34 AM EST

Even your nbsp;s (missing ampersand intentional) in that comment decayed to real spaces. I wonder if this has been discussed on the Scoop site - I suppose if I get motivated enough I'll go look :>.

I end up doing the reverse to what you suggest: using code for blocks and tt for inline. Why do you suggest the reverse - not saying you're wrong, just wondering; is it just because it seems to work better that way in more browsers?

Indeed, I did find I had to break up big nbsp; arrays and replace every 6th or so with a real space. If I submit another article with code, I'll change my exporter program to do that automatically (the original was written using my own pH markup and then I wrote a perl script that used my (C++ XS) parser to output k5-suitable HTML).

czth

[ Parent ]

Why (none / 0) (#132)
by epepke on Thu Aug 14, 2003 at 12:51:04 PM EST

I end up doing the reverse to what you suggest: using code for blocks and tt for inline. Why do you suggest the reverse - not saying you're wrong, just wondering; is it just because it seems to work better that way in more browsers?

Yes, basically. I did the SR article originally with proper use of non-breaking spaces and <code>. It worked fine in all the browsers I'd used (IE, Safari, Opera, Netscape 6). However, I got complaints. So I changed it to <tt>, and everyone was happy.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
Violated my own damn rules! (none / 0) (#131)
by epepke on Thu Aug 14, 2003 at 12:47:20 PM EST

Number two should read "Use &nbsp; alternated with regular spaces for indentation of text. Long strings like &nbsp;&nbsp;&nbsp;&nbsp; can cause problems with the word wrapper."


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
Misconceptions re. formatting (none / 0) (#160)
by htmltidy on Thu Aug 14, 2003 at 06:59:24 PM EST

I also don't think that pre is an allowed tag for stories.
It's clearly not allowed for comments, as the screen I'm looking at shows. I do not know about stories, but will grant the point that if it's not available, then it couldn't have been used.

However, the real problem resulted from simply blindly copying in an excessively wide, unbreakable, line of text. The bulk of the block comment was total whitespace, and a significant portion of the horizontal fill could have been deleted, without loss of the information content for the story itself. By deleting the extraneous horizontal content, a narrower unbreakable line results, with much less risk of causing horizontal scrolling in one's browser.

[ Parent ]

but he's still (mostly) right (none / 0) (#121)
by muyuubyou on Thu Aug 14, 2003 at 10:37:09 AM EST

I think he's a bit rigid with comments and variable naming conventions.

If Donald Knuth, Brian Kernighan, James Gosling, Larry Wall and Linus Torvalds to name a few, all use completely different comment styles and substantially different variable naming conventions, I wouldn't say my particular convention is right.


Variable names have to be visual and easy to understand to the programmer and not too horrid to the rest of the team or potential lurkers.


----------
It is when I struggle to be brief that I become obscure - Horace, Epistles
[ Parent ]
Exactly. Might I also add.... (none / 0) (#126)
by lb008d on Thu Aug 14, 2003 at 11:21:41 AM EST

Articles like this get my goat, especially when the author doesn't provide an example of their own competency when critiquing other people's code. Personally, I would have loved to see the author's own solution to the test problem in order to establish competency. That doesn't exist, so I went looking for code at his home page.

Here is where the article author's code is available. Since perl is by far my most familiar language, I thought I'd look around to find some example perl code. I looked here, here and many other places and kept getting 403 errors. Finally, this file contained some code I could look at.

The perl code in that archive (filter.pl and forward.pl) is hopefully not indicative of the author's best work. filter.pl does not use two of the most common perlisms (the angle input operator in a while loop and print with the default argument). Not to mention that the author espouses the use of whitespace, yet uses 2-space indentation throughout all of his code (at least it's consistently formatted). This may not be a problem in little programs like filter.pl and forward.pl, but in a file like libipt_CZTH.c from lines 389-450, a little more whitespace would be so nice. Finally, why doesn't the author use strict; in either perl program?

It would be enlightening for us all, I'm sure, if the author would provide some code that solves his example problem and is indicative of his best effort.


[ Parent ]

Too many trees to see the forest? (none / 0) (#134)
by czth on Thu Aug 14, 2003 at 01:02:53 PM EST

You point to my code page but then don't bother looking at the first two things on the page - a (partial) operating system with a CVS browser to conveniently view the source, as well as a C compiler (page also has some links to some Java code for parsing a simple language).

What made you look at those three files anyway? Ah, I see, two are on the miscellaneous code page. I only put up the ipqueue code (I notice you didn't comment on the C/C++ - but it was scary kernel code after all) for someone on a local Linux User Group mailing list; I played around with it several years ago, used it for a fairly decent practical joke (think interception and alteration of web pages...) and abandoned it, having learned what I needed. The .pl files probably need to be set to have the right type in Apache - done (both in misc).

I happen to like 2-space indentation. That's a personal style issue (it's also standard in the code I work on which is convenient), and you're not required to like it.

Defense of such baseless accusations is a good incentive for me to post some of my most recent code, though; perhaps I will if I submit my article on my pH system.

I don't know about releasing my solution to the problem; it's code owned by my employer (i.e. actually being used - I didn't have to do the same exercise when I was hired, see article for when we started asking for The Task), and it would also make it too easy for applicants to find it. But here's some equivalent code you might have fun with: the code I was asked to write as my test before I started working here: Split.tar.gz (create a directory first, I should have in the archive but I didn't; I'm leaving it unchanged from when I first submitted it). Note: the code contains my first ever use of XS and Inline::C (i.e. first time interfacing C and Perl), which I learned for the test and which none of the other code even attempted (I was asked to benchmark the difference). I'm still pretty happy with it; note that the main perldoc is in Split.pm and isn't duplicated in the other files, so they only have implementation comments. One thing I would change is to perhaps use the Benchmark module in bench.pl, but I liked the output I generated too.

You also missed a ton of (non-perl) code on my old code page not to mention a (read-only) implementation of ext2fs (the standard Linux file system) in Turbo C++ (because that was the best I had when I managed to overwrite / and panic the kernel (or the other time when Partition Magic died on me while moving a partition) - VC++ makes it too hard to do low-level disk I/O on '9x).

czth

[ Parent ]

Ahem - perl only (5.00 / 1) (#140)
by lb008d on Thu Aug 14, 2003 at 02:13:35 PM EST

What made you look at those three files anyway?

I clearly stated in my original comment that I was only comfortable critiquing perl programs. The compiler and OS code are in C or C++ - languages I'm familiar with and have written programs in, but I wouldn't consider myself qualified to thoroughly critique.

I notice you didn't comment on the C/C++ - but it was scary kernel code after all

I commented on the lack of well-used whitespace in one of those C files. PS, the tone of the article as well as comments like that don't add to your credibility. I have read the "nicer" version of the article and the tone is much more professional - an article that I think more people here would take seriously rather than a rant.

Thanks for the link to the Split XS code - it's easy to follow and understand. I still don't like the two-space indentation, however :-)

[ Parent ]

Actually, he "lambasts" them. (5.00 / 1) (#186)
by haflinger on Fri Aug 15, 2003 at 09:05:04 AM EST

Glass houses, indeed.

Did people from the future send George Carlin back in time to save rusty and K5? - leviramsey
[ Parent ]
Specify a little more (none / 0) (#116)
by hardburn on Thu Aug 14, 2003 at 09:39:54 AM EST

. . . the fields must be accessed as mn#(), which is handled by an AUTOLOAD - a nifty, but horribly slow mechanism . . .

When I read your problem statement an AUTOLOAD was the first thing that came to mind (with the subroutines named after the first three letters of each field). Yes, it's slow, but your problem statement doesn't say anything about speed--just that it be maintainable and OO. Most Perl programmers are going to write code to take less development time rather than CPU time, and an AUTOLOAD trick hit me as the easiest solution from a coding point of view. If the problem statement you linked to was all I had to go on, I wouldn't have worried about speed. I doubt I would have known that you were using Inline::C in your actual solution.

Also, remember that Perl OO isn't particularly fast in the first place. It's not so bad with a single class like this problem requires, but any significant inheritance hierarchy is going to slow it to a crawl.

That said, you could have AUTOLOAD generate a subroutine which is placed into the symbol table of the package and then call it. This means that your first call will be as slow as a normal AUTOLOAD, but subsequent calls are as cheap as a normal method call.

Better still is to dispense with AUTOLOAD altogether and insert closures into the symbol table while you parse the data (see page 338 of the Camel for an example). This is probably how I'd end up doing it.


----
while($story = K5::Story->new()) { $story->vote(-1) if($story->section() == $POLITICS); }


AUTOLOAD (none / 0) (#118)
by czth on Thu Aug 14, 2003 at 10:01:33 AM EST

The problem with using AUTOLOAD like you propose is - what do you do if a field doesn't exist? And, what do you do about duplicates (duplicate fields within the same or different segment; duplicate segment names; even field and segment names that are the same)? And what do you provide to access fields and segments sequentially?

You're right I shouldn't have harped on the speed since if AUTOLOAD did lead to a good solution and it was otherwise tidy speed wouldn't bother me too much (but if we did get to an interview I would ask questions like "how could you make this faster/more efficient" or "what tradeoffs did you make in writing this code?")

If you can deal with the uniqueness aspect then this (especially the closures/caching version) could be a good solution. If, say, you had a top-level message object, which had segment objects (like you said, probably slow but we're not worried about that at this point), then dynamically creating names would be fine, to be used somewhat like:

my $msg = AMF->new(parse => $text);
if(my $seg = $msg->s_fmlog()) {
  my $addr1 = $seg->f_AD1();
  my $city = $seg->f_CTY();
  ...
  or
  ...

  my ($addr1,$city) = $seg->find(qw[AD1 CTY]);
}

Where: I use s_ as a prefix so it doesn't conflict with other methods (which really doesn't make it much better than $msg->find('fmlog')), and such a function would advance to the next segment of that name and return it as an object, or return false if one wasn't found.

czth

[ Parent ]

Corrected corrections (4.00 / 1) (#124)
by des mots on Thu Aug 14, 2003 at 10:44:27 AM EST

char text_in[MAX_TEXT_IN];
/*...*/
sizeof(text_in)

That sizeof will never return a larger value than MAX_TEXT_IN-10.

In fact this sizeof will return MAX_TEXT_IN*sizeof(char) which is MAX_TEXT_IN because sizeof(char)==1 by definition. Note that a table is not altered to a pointer in this context.
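
To make the distinction concrete, here is a small sketch in C. The value 256 for MAX_TEXT_IN and the function names are assumptions for illustration only; the original code did not show them:

```c
#include <stddef.h>

#define MAX_TEXT_IN 256   /* assumed value; not shown in the original */

/* sizeof applied to the array itself yields its full size in bytes. */
size_t size_of_array(void)
{
    char text_in[MAX_TEXT_IN];
    return sizeof(text_in);        /* MAX_TEXT_IN * sizeof(char) */
}

/* Once the array has been passed to a function it has decayed to a
   pointer, so sizeof gives the pointer size, not the array size. */
size_t size_inside_callee(char *text)
{
    return sizeof(text);           /* sizeof(char *) */
}

size_t size_seen_by_callee(void)
{
    char text_in[MAX_TEXT_IN];
    text_in[0] = '\0';
    return size_inside_callee(text_in);
}
```

So sizeof(text_in) is safe wherever the array declaration is in scope, and a function that receives the array needs the length passed alongside it.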

#define SEGMENTEND '||'

but fortunately he never tried to use it. (Character constants can only be single characters, multiple characters need to be strings, in double quotes, and can't be treated the same at all.)

OK, this is ugly, but try to use it, you will just get a warning from the compiler because of a "multi-character character constant". It is a standard non-portable feature of C.
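
For comparison, a sketch of what the definition probably wanted to be: a string literal searched for with strstr. The helper names are hypothetical, and the assumption that a segment ends at the first "||" comes from the sample message format discussed in the article:

```c
#include <stddef.h>
#include <string.h>

#define SEGMENTEND "||"   /* a string literal: two characters plus a NUL */

/* Length of the first segment in msg (everything before the first
   "||"), or the whole length if no terminator is present. */
size_t first_segment_length(const char *msg)
{
    const char *end = strstr(msg, SEGMENTEND);
    return end ? (size_t)(end - msg) : strlen(msg);
}

/* Number of "||" terminators found in msg. */
size_t count_segments(const char *msg)
{
    size_t n = 0;
    const char *p = msg;

    while ((p = strstr(p, SEGMENTEND)) != NULL) {
        n++;
        p += strlen(SEGMENTEND);   /* continue after this terminator */
    }
    return n;
}
```

A character constant can only be compared against one char at a time, which is why the string form is the one that works with the str* functions.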



Multi-character character constants (none / 0) (#125)
by epepke on Thu Aug 14, 2003 at 11:02:50 AM EST

OK, this is ugly, but try to use it, you will just get a warning from the compiler because of a "multi-character character constant". It is a standard non-portable feature of C.

That's depending on the compiler. Compilers on the Mac tend to have this warning turned off because 4-character alphanumeric constants are useful for file types and application signatures.

They're also incredibly useful for Unicode. Use one-byte characters for ASCII and 4-byte characters for UTF-32. Unless you want to cheat and declare that you won't use the astral plane, in which case use 2-byte characters for UTF-16.

On the other hand, defining a 2-character constant '||' for this application still shows a rather egregious lack of understanding.

Way back in college, I once took a C programming course, and even my instructor was confused over the distinction between characters and strings, which I thought pretty obvious from the way arrays and pointers are treated. Fortunately, I was able to learn C simply by reading K&R while recovering from donating blood. This is probably symbolic of something, but I don't know what.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
multi-byte character constants (none / 0) (#167)
by horny smurf on Thu Aug 14, 2003 at 08:39:01 PM EST

gcc supports multi-byte character constants as well.

Pascal uses 'a' for a character constant and 'aa' for a string. It wouldn't surprise me if your instructor spoke pascal.

[ Parent ]

Pascal was the big language at the time (none / 0) (#177)
by epepke on Fri Aug 15, 2003 at 02:20:01 AM EST

On the other hand, the main teaching Pascal at the University was the original CDC Pascal. It didn't have strings per se at all. Rather, it had "alfa," which was a packed array of ten 6-bit characters (which just fits in a 60-bit CDC word). The 6-bit characters didn't have a case distinction, of course, so there was an "ASCII" mode that would use pairs of characters much like UTF-8 can use several characters, only uglier. It was pretty hideous, but that's the way it was.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
Point (none / 0) (#129)
by czth on Thu Aug 14, 2003 at 11:54:07 AM EST

Should be "that sizeof expression will never return more than MAX_TEXT_IN-10" or better "that sizeof will never return more than MAX_TEXT_IN". Too late to edit here but I can fix it on the version on my site, thanks.

czth

[ Parent ]

sizeof array (none / 0) (#130)
by des mots on Thu Aug 14, 2003 at 12:20:33 PM EST

That sizeof expression will always return MAX_TEXT_IN. From the K&R C: The sizeof operator yields the number of bytes required to store an object of the type of its operand.[...]the size of an array of n elements is n times the size of one element.

Note that in ANSI C tables must have constant (known at compile time) size. If I remember well, GCC allows run-time sizing of arrays. I don't know how sizeof works with that.



[ Parent ]
It doesn't, sizeof is compile-time (none / 0) (#136)
by czth on Thu Aug 14, 2003 at 01:12:33 PM EST

And by "never return more than" I wasn't implying it would ever return less, either - just that even if an array was properly grown, you wouldn't be able to get the new size from sizeof. Not sure what you mean about GCC allowing run-time sizing? Perhaps this extension?
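
A sketch of the usual workaround, for anyone following along: when a buffer is grown with realloc, the capacity has to be tracked by hand, because sizeof only ever sees the pointer. The struct and function names here are hypothetical:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable buffer: capacity is stored explicitly,
   since sizeof(b->data) is always just sizeof(char *). */
struct text_buf {
    char  *data;
    size_t capacity;
};

/* Allocate the initial block. Returns 0 on success, -1 on failure. */
int text_buf_init(struct text_buf *b, size_t capacity)
{
    b->data = malloc(capacity);
    b->capacity = b->data ? capacity : 0;
    return b->data ? 0 : -1;
}

/* Grow (or shrink) the block. On failure the old block stays valid. */
int text_buf_grow(struct text_buf *b, size_t new_capacity)
{
    char *p = realloc(b->data, new_capacity);
    if (p == NULL) {
        return -1;
    }
    b->data = p;
    b->capacity = new_capacity;
    return 0;
}

void text_buf_free(struct text_buf *b)
{
    free(b->data);
    b->data = NULL;
    b->capacity = 0;
}
```

After text_buf_grow(&b, 64), b.capacity reports 64 while sizeof(b.data) is unchanged.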

Also, why do you refer to arrays as tables?

czth

[ Parent ]

arrays (none / 0) (#145)
by des mots on Thu Aug 14, 2003 at 03:11:03 PM EST

Yes, this extension. And for "table" instead of array, I don't know, maybe because in French we say "table" or "tableau". BTW, do you have an example of really good answer to this exercise?

[ Parent ]
Answer to exercise (none / 0) (#154)
by czth on Thu Aug 14, 2003 at 05:48:34 PM EST

As I say toward the end of this comment, I hesitate to post our "current solution" because it's code owned by my employer and also I don't want to make available a solution to a task that we're still using.

However, to meet people half way on this, here's a solution but with minimal comments and no perldoc and probably more compact than it needs to be. I also use linear name lookups, which would be much better done with the aid of hashes:

# AMF message parser and objects
# dbr 20030814

package AMF;

use strict;

# new(message_text) => return $self or error message
sub new {
  my ($class,$text) = @_;
  my $self = bless { data => [], seg => -1, fld => -1 }, ref $class || $class;

  for(split /\|\|/,$text) {
    # declare first: "my $x = ... if ..." has undefined behaviour in Perl
    my $name;
    $name = $1 if s/^([^|]+)\|//;
    return "malformed segment, can't read name: '$text'" unless defined $name;
    my @flds;
    push @flds, [$1,$2] while s/^(...)([^|]*)(\||$)//;
    return "malformed '$name' segment, extra data: '$_'" if length;
    push @{$self->{data}}, { name => $name, field => \@flds };
  }

  return $self;
}

# reset() => go back to the start of the message
sub reset {
  my $self = shift;
  $self->{seg} = $self->{fld} = -1;
}

# reset_seg() => go back to the start of the current segment
sub reset_seg {
  my $self = shift;
  $self->{fld} = -1;
}

# next_seg() => return next name and advance to it | undef if at end
sub next_seg {
  my $self = shift;
  $self->{fld} = -1;
  return $self->{seg} >= $#{$self->{data}} ? undef :
   $self->{data}->[++$self->{seg}]->{name};
}

# find_seg(name) => return name and advance to next seg of that name | undef
sub find_seg {
  my ($self,$name) = @_;
  my (@pos,$next) = @{$self}{qw(seg fld)};
  while(defined($next = $self->next_seg())) {
    last if $next eq $name;
  }
  @{$self}{qw(seg fld)} = @pos unless defined $next;
  return $next;
}

# next_fld() => return (name,value) (ref in scalar context) | undef if at end
sub next_fld {
  my $self = shift;
  return undef if $self->{seg} < 0;
  my $fld = $self->{data}->[$self->{seg}]->{field};
  return () if $self->{fld} >= $#{$fld};
  $fld = $fld->[++$self->{fld}];
  return wantarray ? @{$fld} : $fld;
}

# find_fld(name) => return value of next field in segment of that name | undef
sub find_fld {
  my ($self,$name) = @_;
  my ($pos,$next) = $self->{fld};
  while(defined($next = $self->next_fld())) {
    last if $next->[0] eq $name;
  }
  if(defined $next) {
    return $next->[1];
  } else {
    $self->{fld} = $pos;
    return undef;
  }
}


1;

Make of that what you will, I just wrote it. Commenting is intentionally minimal. We'd be happy with it as a response to our exercise except for the lack of perldoc and paucity of comments but like I said I made 'em thin here on purpose. The linear search is slow but making it faster would require making it more complex so the applicant is safe with this but should be able to answer questions like "how would you improve the efficiency if X" (e.g. "if messages typically have thousands of fields and segments and we often want to find just a few", or "if we didn't care about order [we do] or didn't allow duplicates [we do], how could you improve the general efficiency" (big hint to use a hash)).

czth

[ Parent ]

OK (none / 0) (#189)
by lb008d on Fri Aug 15, 2003 at 10:49:05 AM EST

This is what I wrote shortly before writing this comment. As you can see, we have different ideas about what the functionality of the code should be - for instance, I figured duplicates didn't happen, but since you mentioned later that they could, I would change the data area for a field to an array reference. Data would be pushed on as it was seen in the message, thus preserving order. Using hashes to store the segments doesn't preserve order either, but that could be changed as well.

Our ideas of iterating over the messages are different as well. Perhaps I should have included the ability to rewind the iterator by using a counter rather than shift() to iterate over the array.

I think I got the formatting right, but this is my first code posting attempt here.

package AMFParser;

require 5.8.0;

use strict;
use Carp qw( carp );
our $VERSION;

$VERSION = '0.1';

sub new
{
    my $invocant = shift;
    my $obj = bless( {}, ref $invocant || $invocant );

    # Optional argument is msg to parse
    if ( scalar @_ == 1 )
    {
        unless ( $obj->parse( shift ) == 1 )
        {
            return undef;
        }
    }
    return $obj;
}

sub parse
{
    my $self = shift;
    my $amf_string = shift;

    unless ( $amf_string =~ /\|\|\z/ )
    {
        carp "AMF message is not well-formed.";
        return 0;
    }

    my (@segments) = split /\|\|/, $amf_string;
    for my $segment (@segments)
    {
        my ($segment_name, @flds) = split /\|/, $segment;
        next unless defined $segment_name;
        for my $fld (@flds)
        {
            my $name = substr $fld, 0, 3;
            my $data = substr $fld, 3;
            $self->{__PACKAGE__ . '::segments'}->{$segment_name}->{$name} = $data;
        }
    }
    return 1;
}

sub message_iterator
{
    my $self = shift;

    my @segments =  keys %{ $self->{__PACKAGE__ . '::segments'} };
    my $itr_sub = sub {
        unless ( @segments )
        {
            return undef;
        }
        my $retval = [ $segments[0], $self->segment_iterator($segments[0]) ];
        shift @segments;
        return $retval;
    };
}

sub segment_iterator
{
    my $self = shift;
    my $segment = shift;

    unless (defined $self->{__PACKAGE__ . '::segments'}->{$segment})
    {
        carp "Can't create iterator for segment $segment. Segment does not exist\n";
        return undef;
    }
    my @flds = keys %{ $self->{__PACKAGE__ . '::segments'}->{$segment} };
    my $itr_sub = sub {
        unless ( @flds )
        {
            return undef;
        }
        my $retval = [ $flds[0], $self->{__PACKAGE__ . '::segments'}->{$segment}->{$flds[0]} ];
        shift @flds;
        return $retval;
    };
}

sub get_segment
{
    my $self = shift;
    my $segment = shift;
    unless (defined $self->{__PACKAGE__ . '::segments'}->{$segment})
    {
        carp "Can't find segment $segment. Segment does not exist\n";
        return undef;
    }
    return $self->{__PACKAGE__ . '::segments'}->{$segment};
}

sub get_field
{
    my $self = shift;
    my $field = shift;
    for my $segment ( keys %{ $self->{__PACKAGE__ . '::segments'} } )
    {
        if (exists $self->{__PACKAGE__ . '::segments'}->{$segment}->{$field} )
        {
            return $self->{__PACKAGE__ . '::segments'}->{$segment}->{$field};
        }
    }
    return undef;
}

1;

Test program:


use strict;
use AMFParser;

my $msg = 'fmlog|OIDHICAD1|LCLBASRC|RIDHILEX|TSP1031348701||IACN|TYPL|PROCHHCH|LLL1A|BKCA* *NGH|SOUAGENT||txnorg|OIDHICAD1|LCLBASRC|TRM10.191.0.25|TIPAMFSAM|RIDHILEX|SIDG8 600||addr1|TYPH|STR4554 Eleventh Street|APT|STANY|ZIP07440-1913|COUUS||addr2|TYPB|STR202 Fifth Avenue|APT12|STANY|ZIP07410-1001|COUUS||';

my $obj = AMFParser->new($msg);
# $obj->parse($msg);

my $msgitr = $obj->message_iterator();

while ( my $segment = $msgitr->() )
{
    print "Segment $segment->[0]\n";
    my $segmentitr = $segment->[1];
    while ( my $fld = $segmentitr->() )
    {
        print "\tField: $fld->[0] Data: $fld->[1]\n";
    }
}

my $data = $obj->get_field('LCL');
print "LCL: $data\n";


[ Parent ]

Issues (none / 0) (#198)
by czth on Fri Aug 15, 2003 at 03:08:33 PM EST

We would have a few issues with that; I'm not going to go through the code like I did the C++ version that was posted (if I critiqued every solution I'd run out of time and people would keep thinking I'm an arrogant know-it-all :P). I like the iterators - we use something like that in our database objects except to make people feel warm and fuzzy it's actually a blessed coderef so people can call a 'Next' method on it (where Next is just sub Next { my $self = shift; $self->() }). In theory __PACKAGE__.'::name' is nice but in practice it's messy and I'd prefer to see a hand-written prefix like $self->{_amf_name}. And of course the order/duplicates you mentioned already - and lack of comments.

czth

[ Parent ]

Ah but you said extensible... (none / 0) (#218)
by lb008d on Sat Aug 16, 2003 at 06:41:53 PM EST

which I took to partially mean "inheritable". The __PACKAGE__ . '::name' syntax facilitates inheritance of data members, since perl provides no built-in method to protect against name clashes. And, since the user of the module is never exposed to it, it won't affect them. How much it affects the developer is up for question.

[ Parent ]
why did you write a judgemental article (none / 0) (#240)
by phred on Wed Aug 20, 2003 at 04:57:01 PM EST

then post a bad solution (no doc, linear arrays). Are you saying you can't pass your own article criteria?

[ Parent ]
Actually, it does (5.00 / 1) (#163)
by ZorbaTHut on Thu Aug 14, 2003 at 08:21:21 PM EST

I tested this out a while back and determined that sizeof() wasn't compile-time when you were dealing with run-time sized arrays. Headache-inducing and irritating, but that's the way it works.

#include <iostream>

int main()
{
  int x;
  std::cin >> x;
  int arr[ x ];
  std::cout << sizeof( arr ) << std::endl;
}

[zorba@localhost zorba]$ ./a.out
12
48
[zorba@localhost zorba]$ ./a.out
44
176
[zorba@localhost zorba]$ ./a.out
0
0
[zorba@localhost zorba]$ ./a.out
1
4
[zorba@localhost zorba]$ ./a.out
-1
4294967292

I especially like that last one :P

[ Parent ]

Interesting (and long time...) (none / 0) (#164)
by czth on Thu Aug 14, 2003 at 08:24:45 PM EST

But runtime-sized arrays are a GCC extension, right - so it's probably safe to say that it's still compile time for standard C.

Long time, how goes?

czth

[ Parent ]

Extensions (none / 0) (#171)
by ZorbaTHut on Thu Aug 14, 2003 at 10:55:47 PM EST

Actually, runtime-sized arrays are part of C99. GCC allows C++ code to use them also - *that's* the extension - so they'll definitely be runtime in standard C as well.

Then again, GCC also doesn't properly implement C99 runtime-sized arrays, and I can't find *anywhere* what the implementation should be :P So take that with a grain of salt. GCC will do runtime sizeof() for dynamic arrays no matter what - whether that's what it should do is another matter altogether.
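
For reference, a minimal sketch of the C99 behaviour in plain C. The function names are made up for illustration, and this assumes a compiler with variable-length array support (required in C99, optional from C11 onward):

```c
#include <stddef.h>

/* C99: sizeof applied to a variable-length array is evaluated at run
   time. n must be positive; a zero or negative length is undefined
   behaviour, which is why the -1 case prints nonsense. */
size_t vla_bytes(int n)
{
    int arr[n];
    arr[0] = 0;              /* touch the array so it is not unused */
    return sizeof arr;       /* n * sizeof(int), computed at run time */
}

/* For contrast: sizeof on a fixed-size array is a compile-time constant. */
size_t fixed_bytes(void)
{
    int arr[12];
    arr[0] = 0;
    return sizeof arr;       /* 12 * sizeof(int) */
}
```

With a 4-byte int, vla_bytes(12) and vla_bytes(44) give 48 and 176, matching the output from the earlier test run.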

And it goes. Life is annoyingly busy, and bugs are irritating. The usual.

[ Parent ]

A kinder, gentler version (4.66 / 3) (#127)
by czth on Thu Aug 14, 2003 at 11:32:34 AM EST

Several people have, in various comments, objected to my style and perceived attitude in this article, either considering it arrogant, or me a big meanie that can't/won't take the time to educate people on what's right, etc. I take none of it personally; everyone gets to rant now and then and sometimes they even get their rants posted.

That said, it's hard to maintain full rantitude, especially since the original article was written almost a week ago. I acknowledge that some of the comments were right and it might have been more helpful to take a different approach - a teaching approach rather than that of finding fault (or even entertainment). So, as promised, I did some revision and I now present

     Common Programming Errors, and How To Overcome Them

The content is much the same (some expansions, some deletions), but this time I try more to provide good alternative examples and be less harsh toward those unfortunate enough to have grown up with a TV.

czth

Yuck! (4.00 / 1) (#157)
by htmltidy on Thu Aug 14, 2003 at 06:49:36 PM EST

An even more egregious example of bad practices in html/css design. Note, in http://i4031.net/article/style/dark.css you have this: BODY { font-family: arial, sans-serif; color: white; background: black; } You should do one of two things. Preferably, delete "color: white; background: black;", or provide an alternate stylesheet that omits those two lines. If you want white text on a black background, set your browser defaults to give you white text on a black background. But DON'T ask my browser to give me white text on a black background.

[ Parent ]
Oh _come_ _on_ (none / 0) (#162)
by czth on Thu Aug 14, 2003 at 07:29:05 PM EST

You can write your own stylesheet and override elements you want if you have a decent browser. Nothing wrong with me picking colours I like for a page. That goes beyond nit-picking into silliness. Go after something truly evil, like Flash-only sites with no alternative provided.

czth

[ Parent ]

Ah... Myopia is such a wonderful thing (not) (none / 0) (#166)
by htmltidy on Thu Aug 14, 2003 at 08:31:46 PM EST

You can write your own stylesheet and override elements you want if you have a decent browser.
You are correct, and I already do that, to attack all the other folks who think it's a good idea to farm out "background: white;" or bgcolor="#ffffff" (which is by far the most widely seen color farmed out for page backgrounds).
Nothing wrong with me picking colours I like for a page.
I agree, there is nothing wrong with you picking a color you like for your page. My dispute is when you forget to look past your little section of the world (note the myopia reference in the subject line) and presume that I want to see your page in the colors you like. You should not be farming out your personal choice of foreground/background colors on others. If you like white text on black backgrounds, then by all means, set up your own style sheet overrides and browser defaults to give you white text on black backgrounds. But don't presume that the rest of the world wants to see white on black by sending it out to other folks' browsers.

And if you still find you must provide others with your color choices, at least provide an alternate style sheet for the rest of us who don't want to look at your color choices. Again, any decent modern browser also supports selection of alternate style sheets as well. It's one more LINK tag, and one more style sheet on your server. Not a significant amount of effort.

That goes beyond nit-picking into silliness.
Not nit-picking, not silliness, just another example of poor practices on the web. The assumption that the rest of the world wants to see your personal choice of foreground/background colors.
Go after something truly evil, like Flash-only sites with no alternative provided.
Granted, yes, much more evil than "background: black; color: white;", but even so, that does not detract from the poorness of assuming the rest of the world likes or even wants to see your color choices for foreground/backgrounds on your page.

[ Parent ]
Good god. (none / 0) (#183)
by haflinger on Fri Aug 15, 2003 at 08:11:31 AM EST

You're objecting to this page on the grounds that it includes valid CSS?

Man, and it's not like it doesn't have a few flaws. Check this. (It lacks a DOCTYPE, for starters.)

But then, you're posting on k5, which is much, much worse.

Did people from the future send George Carlin back in time to save rusty and K5? - leviramsey
[ Parent ]

Wha? (5.00 / 1) (#195)
by Politburo on Fri Aug 15, 2003 at 01:46:40 PM EST

I agree, there is nothing wrong with you picking a color you like for your page. My dispute is when you forget to look past your little section of the world...and presume that I want to see your page in the colors you like.

Are you implying that all web pages should include no color specification and that each end-user should configure it to their own desires? Or that each web designer should ask their users if they agree with the color selection? Or do we need a global list of colors you like to make our web pages in?

[ Parent ]
academic work (none / 0) (#239)
by phred on Wed Aug 20, 2003 at 04:53:25 PM EST

is typically black type on white background because it emulates the written page. Hacker/cracker pages (like yours) are typically light type on black pages because they look eleet. Trouble is, at least the hackers use an attractive font and colors to highlight stuff. Your page, although instructive, lost serious points on styling. This from somebody being judgemental about written information (code) is sort of a contradiction, but everybody seems to be an expert nowadays so I'm just mentioning this in passing.

[ Parent ]
Me too! (none / 0) (#181)
by mozmozmoz on Fri Aug 15, 2003 at 07:35:26 AM EST

I am torn between rating the czth comment 5 because he's responded to feedback and apologised, or rating it 0 because the page is so unreadable. Instead I'll post this.

Having to impose my style sheet on the pages is just an ugly solution. I did it for a while, but few browsers allow you to just put a button somewhere to toggle between forcing the user style sheet and not, and if you always force it, a lot of pages become unreadable, so your local sheet grows as you force more and more elements. In the end it's easier to just use lynx and be done with it.

There's lots of comedy on TV too. Does that make children funnier?
[ Parent ]

my dear god. (2.00 / 7) (#128)
by rmg on Thu Aug 14, 2003 at 11:48:14 AM EST

this article made me want to wretch.

the writing is marginal. you are not clever enough to write this article properly. sorry.

it is so steeped with IT/"geek" mannerism it made me ill. i got that feeling you get when you walk into a seedy computer store... but not the good kind. the kind that sells old crap, but uses words like "solutions" in its ads.

this article is why i would never be a computer janitor like you. not because i would be surrounded by idiots. that is all but unavoidable. it is because the people who believe they are not idiots would talk like this.

i used to think articles about politics are lame. but now that i see a "technology" article like this one, i see them in an entirely different light. personally, i hope to see a million articles about mogadishu.

_____ intellectual tiddlywinks

You poor wretch (4.00 / 1) (#135)
by czth on Thu Aug 14, 2003 at 01:07:52 PM EST

<spelling-nazi>
You poor wretch, I think you meant retch.
</spelling-nazi>

Have a wretchedly trollicious day.

czth

[ Parent ]

while i am often accused of trolling... (3.50 / 2) (#137)
by rmg on Thu Aug 14, 2003 at 01:29:42 PM EST

that post was completely sincere.

and as regards the spelling, i have a lot of comments to post, and not a lot of regard for my readers. i cannot be bothered to catch every typo i make.

good day.

_____ intellectual tiddlywinks
[ Parent ]

Constructive specifics (3.00 / 1) (#142)
by czth on Thu Aug 14, 2003 at 02:23:37 PM EST

would be nice. I don't really care much about it being 'steeped with IT/"geek" mannerism'; that's quite intentional and it's nowhere near as bad as writing I've seen that looks like someone spilled the Jargon File on it.

Your "computer janitor" term is interesting too - obviously meant to be insulting, but do you have a point beyond that? What, if I may ask, do you do for a living?

czth

[ Parent ]

my point (none / 0) (#146)
by rmg on Thu Aug 14, 2003 at 03:12:07 PM EST

is that "computer professionals" do a job that high school students could perform. in fact, i had two friends in high school who had real jobs in IT. one was a programmer, one was a systems administrator.

at the same time, they build a whole community, which they are pretentious enough to call "geek culture," around the idea that they possess some vast knowledge or superior intelligence that separates them from the "masses." i.e. their mom, their grandparents, management, marketing, etc.

more than that, the ones on this site and others will claim to be "philosophers" and "scientists," and whatever else, believing themselves eminently qualified in all matters of science and engineering. this has nothing to do with your article in particular, but it does inform the background from which it springs.

all of these things are distilled and given their most "eloquent" expression in the works of esr. the jargon file, the cathedral and the bazaar, etc, etc, etc. he prattles on about sociology as if he is qualified to make any serious commentary on the subject. he talks about eastern religion. the list goes on. it is typical of the kind of pretentious claptrap that esr and "geek culture" have made appear acceptable to the "geek masses."

but let's back up a bit. what sorts of jobs can be performed by mere high school graduates, or even those who have not yet graduated. what sorts of jobs do such applicants garner? retail, certainly. food service. service industry in general. but let's not forget the information technology industry.

hence computer janitor.

derogatory, yes. harrowingly close to the truth? absolutely.

and what could the computer professionals of our world need more than an appropriate epithet to deflate their gigantic collective ego?

but to address your specific article, my complaint is precisely the high-handed, "geek" language and the generally condescending tone it carries, and more than that, the fact that it was so readily snatched up by the voters of this site. anything that makes them feel smart is front page or at least section material.

_____ intellectual tiddlywinks
[ Parent ]

I absolve you of your mediocrity (none / 0) (#149)
by czth on Thu Aug 14, 2003 at 03:59:10 PM EST

my point is that "computer professionals" do a job that high school students could perform. in fact, i had two friends in high school who had real jobs in IT. one was a programmer, one was a systems administrator.

And I know a 14-year-old with 3 Ph.D's, what's your point? For every profession except those that legally require some sort of certification, there are people that learn enough on their own to do it without "higher education." And likewise there are simple positions in each profession which even an untalented hack can do, or can fool people into thinking he's doing. It's not particular to IT but because of the late bubble it's more obvious.

Yes, there's pretentiousness. But what exactly makes one qualified to comment on sociology, anyway? Only a degree in Sociology? A Ph.D? By virtue of some people's unique position they may be able to make astute observations about a field that "professionals" cannot. Do they/we carry it too far sometimes? Sure. Is there an identifiable "culture"? Definitely, but probably not more so than in any other speciality.

derogatory, yes. harrowingly close to the truth? absolutely.

Sometimes. Not always, or even all that often. Can the average highschooler work in IT? (Heck, we're lucky if the average highschooler can tie his own shoes.) Maybe a scripted help desk job. I dare say there are people that could do accounting for small businesses without going to school for it (if it was legal); and that's why some people are pushing for "IT" certifications, while others vehemently oppose it.

Now if you'll excuse me I have to wax the floor in my homestead in the GNU/noosphere... wax on, wax off....

czth

[ Parent ]

a typical response... (5.00 / 1) (#200)
by rmg on Fri Aug 15, 2003 at 03:44:29 PM EST

14 year-olds with phd's have nothing to do with this. these kids were above average, yes, but neither of them was exceptional. i was smarter than both of them. they had experience, that is all.

this response completely dodges my points. i did not say that there is pretentiousness. i said that the "culture" is based on pretentiousness.

why do i even bother...

_____ intellectual tiddlywinks
[ Parent ]

Whatever happened to your 'shift' key? (2.00 / 1) (#143)
by muyuubyou on Thu Aug 14, 2003 at 02:47:52 PM EST

You know, capitals.


----------
It is when I struggle to be brief that I become obscure - Horace, Epistles
[ Parent ]
a close reading will reveal... (3.00 / 3) (#144)
by rmg on Thu Aug 14, 2003 at 02:54:17 PM EST

that i do use capitals in this post.

HTH. HAND.

_____ intellectual tiddlywinks
[ Parent ]

Problems with the problem (4.00 / 1) (#138)
by Smerdy on Thu Aug 14, 2003 at 01:36:48 PM EST

I certainly agree that 90% of self-described "programmers" are unqualified as developers. However, this problem statement makes me wonder about your qualifications and/or the nature of your company.

You seem to confuse object-orientation with data encapsulation. Object-orientation is often a very clumsy way to do what is done much more sanely with abstract types in ML and other language families.

The problem you pose is also trivial. It should take any skilled programmer 10 minutes to write an efficient and easily readable solution. The only conclusion I can draw is that the real problems solved by your company are trivial enough that this is an adequate test of ability.

Encapsulation and triviality (4.00 / 1) (#141)
by czth on Thu Aug 14, 2003 at 02:20:10 PM EST

You seem to confuse object-orientation with data encapsulation. Object-orientation is often a very clumsy way to do what is done much more sanely with abstract types in ML and other language families.

Yes, you're right; I mention elsewhere that we're really only going after the encapsulation aspect of OO here even though we use other aspects in "real" code.

The problem you pose is also trivial. It should take any skilled programmer 10 minutes to write an efficient and easily readable solution. The only conclusion I can draw is that the real problems solved by your company are trivial enough that this is an adequate test of ability.

That is true (should being the operative word, maybe a little more time for testing and then to write some tests to demonstrate to us that it works), but the conclusion is wrong: developers applying for a position don't have days to spend writing a "test" program (plus combine that with how long it actually does take people rather than how long it should take and you'll see our problem). Of course we could use a harder problem and say "if you think this is too hard or will take you longer than X hours then please for your own sake don't bother" but I doubt if that would fly well.

I looked at your ML links a bit, looks like an interesting enough language, but I suspect that like many MFTLs it doesn't have sufficient library support (e.g.: {Informix,Oracle,MySQL,Sybase,ODBC} database drivers? regular expression engine?) or speed to be used in the Real World. I'd also like to be able to use Scheme (either at work or at home) but don't for the same reasons.

czth

[ Parent ]

ML (none / 0) (#151)
by Smerdy on Thu Aug 14, 2003 at 04:08:55 PM EST

I looked at your ML links a bit, looks like an interesting enough language, but I suspect that like many MFTLs it doesn't have sufficient library support (e.g.: {Informix,Oracle,MySQL,Sybase,ODBC} database drivers?

All of the popular ML compilers have C foreign function interfaces, meaning that most C libraries are usable with a little bit of wrapper code. You can look at my minuscule PostgreSQL interface for an example. Most of that file implements not strictly necessary features that make SQL interaction in a semi-functional style easier. The rest of the code in that part of the CVS repository is generated by an automated tool that comes with SML/NJ.

regular expression engine?

All of the major ML compilers come with regular expression libraries.

or speed to be used in the Real World.

On the contrary, ML is one of the best suited language families around for efficient compilation. With the equivalent amounts of industrial time and money investment in compiler development to what C has received, there would be no way to argue against it in terms of speed. It certainly beats the socks off of Perl, and the best ML compilers tend to beat g++ soundly for speed of many programs.

For slight corroboration, see the results of the Great Programming Language Shootout. While I don't claim that this study was especially scientific, seeing tiny ML programs compiled to code roughly as fast as equivalent bulky C programs averaged over 25 tests ought to at least make you stop and think before making another offhand comment about speed.

More information on ML

[ Parent ]

Estimating task time (5.00 / 1) (#182)
by mozmozmoz on Fri Aug 15, 2003 at 07:53:51 AM EST

plus combine that with how long it actually does take people rather than how long it should take and you'll see our problem

One thing I used to like doing was stopping people at the end of their estimated time and asking how far through their task they were. Most developers are off by 2x to 10x as soon as they step even slightly outside their experience. Inside their domain the error and uncertainty both drop a bit, but it's IME very rare to get accurate estimates from coders. If you'd said 'it took me...' I'd be less skeptical. Curious: how long did it actually take you?

I looked at the problem, sketched a solution in Delphi (what I'm familiar with), and decided that idle curiosity about the problem wasn't enough to make me spend time building the solution and convincing myself that it worked. Time taken was about 10 minutes. I guess over an hour to get something that actually worked, of which at least 10 minutes would be time taken to think up test cases, and 10 minutes to understand the problem (which I spent reading kuro5hin to get the same result).

I suspect I could probably do it in Perl more concisely, but my Perl skills are limited. ML I want to look at more.

Moz

There's lots of comedy on TV too. Does that make children funnier?
[ Parent ]

Hey! Lay off ACE! (none / 0) (#139)
by avdi on Thu Aug 14, 2003 at 01:37:50 PM EST

Great, great article.  It boggles my mind that these people actually apply for jobs.  However...

IMHO, ACE is up there with boost as one of the most important libraries in existence for the C++ programmer trying to get something done in the real world.  From its very complete OS wrappers, to its frameworks for implementing higher-level multithreaded distributed programming patterns, it's the only way I know of to write multithreaded, network-aware C++ code which a) is high-performance; b) compiles on every platform under the sun; and c) doesn't get bogged down in tedious generic infrastructure.

I use it where I work, writing networking code for an air-traffic control system which consists of a number of applications on various platforms.  ACE has been a tremendous timesaver.  It's true it's quite the behemoth, but none of it's wasted space.

--
Now leave us, and take your fish with you. - Faramir

ACE (none / 0) (#150)
by czth on Thu Aug 14, 2003 at 04:01:22 PM EST

I thought I might find someone that objected to my criticism of ACE. It's probably gotten better, but I find it way too heavyhanded (the redefinition of main is one example - most toolkits just tell/ask you to call init/finalize functions). And yes, Boost is nice.

czth

[ Parent ]

On your Pascal Link... (4.50 / 2) (#147)
by Canar on Thu Aug 14, 2003 at 03:19:24 PM EST

In Delphi, which uses Borland's Object Pascal, almost every single complaint the author has for "Why Pascal Is Not His Favorite Programming Language" has been fixed. Object Pascal is now nearly as robust a language as C++, and is several orders of magnitude easier to properly interpret.

Delphi (4.00 / 1) (#148)
by czth on Thu Aug 14, 2003 at 03:44:50 PM EST

I used Delphi in 1997: my boss at work wanted me to use it, so I did, and then I installed it at home and did a fair bit of work with it (IRC client with plugin DLL support, web/telnet server with ISAPI support). It was probably my first extensive exposure to WINAPI programming; fortunately it translated well to Visual C++. I liked it well enough (especially the RAD tools) but I prefer C++.

czth

[ Parent ]

I quickly banged out "the task" (3.00 / 3) (#156)
by enkidu on Thu Aug 14, 2003 at 06:44:08 PM EST

Plain text mode doesn't seem to mangle it.

/**
 * This program is intended to parse AMF (ASCII Message Format) messages.  It
 * provides the working base for a query mechanism by storing the parsed
 * messages in an array ordered by the original message, but with maps to
 * allow quick searching.  Field lengths are read in and parsed using
 * std::string and stream.
 *
 * A message is an ASCII string of segments each ending with a double-pipe
 * ("||").  A segment consists of one or more pipe ("|") separated strings;
 * the first string is the segment's name.  The remaining strings are fields;
 * the first 3 characters are the field name, the rest are the data.  You may
 * assume that the input data is valid.
 *
 * Here are the assumptions made in this program and how to fix them if they
 * are incorrect:
 *
 * Names or fields are 1024 characters long or less.
 *   Modify the define MAXFIELD line and the two lines following to create a
 *   character buffer of appropriate size and read in the entire field or
 *   name.
 *
 * Names of segments are not repeated
 *   Change the 'if (NULL == currSeg)' block to check if the name already
 *   exists and to assign the appropriate Segment to currSeg.
 *
 * Fields are assumed to be legal (that is > 3 characters in length)
 *   If not, then some format checking (length etc.) of the field should be
 *   done before calling AddField
 *
 * Names of fields are not repeated
 *   Depending on the desired behavior (overwrite or extend), modify the
 *   Segment::AddField method.
 *
 * All of the segments fit into memory
 *   If this is not the case, then there needs to be some refactoring with
 *   design based on usage and available memory.
 */
#include <stdio.h>
#include <string>
#include <map>
#include <vector>

/**
 *
 * @brief  Stores a segment
 * @author enkidu
 * Stores a named segment.  A Segment consists of a single name and fields.
 * Fields consist of a three character name and a string value.
 */
class Segment {
private:
    string               fName;           ///< name of segment
    vector<string>       fFieldNames;     ///< array of field names
    vector<string>       fFieldVals;      ///< array of field values
    map<string, int>     fFieldNameMap;   ///< map of field names to indices

public:
    /// only public ctor;
    Segment(const string& inName);
    /// Returns name of segment
    string  GetName() { return fName; };
    /// Adds a field
    void    AddField(const string& inFieldName, const string& inVal);
    /// Finds the index of a field given a name
    int     FindField(const string& inFieldName);
    /// Returns the Nth field name.
    string  GetNthFieldName(int inIndex);
    /// Returns the Nth field value.
    string  GetNthFieldVal(int inIndex);
    /// Returns the number of fields
    int     GetFieldNum() { return fFieldNames.size(); }
    /// Dump the contents of the segment to stdout.
    void    Dump(void);
};

Segment::Segment(const string& inName)
: fName(inName)
{
}

void Segment::AddField(const string& inFieldName, const string& inVal)
{
    map<string, int>::iterator fieldItr = fFieldNameMap.find(inFieldName);
    if (fieldItr == fFieldNameMap.end())
    {
        fFieldNameMap[inFieldName] = fFieldNames.size();
        fFieldNames.push_back(inFieldName);
        fFieldVals.push_back(inVal);
    }
    /// Handle duplicates here.
}

int Segment::FindField(const string& inFieldName)
{
    map<string, int>::iterator fieldItr = fFieldNameMap.find(inFieldName);
    if (fieldItr == fFieldNameMap.end())
    {
        return -1;    
    }
    return (*fieldItr).second;
}

string Segment::GetNthFieldName(int inIndex)
{
    if (inIndex >= fFieldNames.size())
        return "";
    else
        return fFieldNames[inIndex];

}

string Segment::GetNthFieldVal(int inIndex)
{
    if (inIndex >= fFieldVals.size())
        return "";
    else
        return fFieldVals[inIndex];
}

void Segment::Dump(void)
{
    cout << "Name : " << fName << endl;
    // This could be done with two iterators, but this is just as fast and
    // requires only one local variable.
    int i;
    for ( i = 0; i < fFieldNames.size(); ++i)
    {
        cout << "  " << fFieldNames[i] << " : " << fFieldVals[i] << endl;
    }
}

/**
 * Reads in an AMF given an input stream.
 *
 * @arg ioStream the input stream to parse
 * @arg ioSegArray of segments in the order they are read
 * @arg ioSegNameMap map of segment names to index in ioSegArray
 */
int ReadSegments(istream& ioStream,
        vector<Segment*>& ioSegArray,
        map<string, int>& ioSegNameMap)
{

    // Initialize to start reading the name of a Segment.
    Segment* currSeg = NULL;
    while (!ioStream.eof())
    {
        // See header comments regarding assumptions of field size.
#define MAXFIELD 1024
        char buf[MAXFIELD];
        // We read in records by '|' delimited chunks.
        ioStream.getline(buf, MAXFIELD, '|');
        // After this we are assuming that buf contains an entire field.

        // if currSeg is NULL then we are starting a new segment.
        if (NULL == currSeg)
        {
            string name = buf;
            currSeg = new Segment(name);
            ioSegNameMap[name] = ioSegArray.size();
            ioSegArray.push_back(currSeg);
            continue;
        }
        // if buf is empty, then we've just read in the end of a record "||".
        if (strlen(buf) == 0)
        {
            // Start a new record
            currSeg=NULL;
            continue;
        }
        string field;
        field = buf;
        // Read in a field
        currSeg->AddField(field.substr(0,3), field.substr(3,MAXFIELD - 3));
    }

}

int main(int argc, char** argv)
{
    vector<Segment*>  segArray;
    map<string, int> segNameMap;

    ReadSegments(cin, segArray, segNameMap);

    // Access/Queries can be done here, for now just dump the contents
    vector<Segment*>::iterator segItr;
    for ( segItr = segArray.begin(); segItr != segArray.end(); ++segItr)
    {
        Segment *locSeg = *segItr;
        locSeg->Dump();
    }

    // Clean up.
    for ( segItr = segArray.begin(); segItr != segArray.end(); ++segItr)
    {
        Segment *locSeg = *segItr;
        delete locSeg;
    }
}


I suppose I'm obligated to reply (4.50 / 2) (#161)
by czth on Thu Aug 14, 2003 at 07:25:33 PM EST

Note: it would have been kinder to post your attempt either on another site (and provide a link) or deeper than the top level, so that people viewing the site not in "nested" mode don't have to scroll past it.

First, it doesn't compile (GCC 3.2.2), but that's because g++ wants fully qualified names (std::string not string). I threw in a 'using namespace std' (which I hate, but I didn't feel like replacing all occurrences); still missing things (like cout) so I added an include for iostream and it compiled fine. Those things I don't consider a problem since we don't expect people to test on all possible compilers. -Wall puts out a few signed/unsigned complaints and note that ReadSegments says it returns int but actually returns nothing (changed to void).

Initial test: appears to handle the input OK (spits out an extra blank segment though), maybe because of the newline? (Nope: does the same thing with a file with no newline at the end.) Minor issue anyway but I wonder why it didn't show in testing. Generally if you could do the same thing (i.e. same degree of correctness, use of libraries, etc.) in C we'd want to interview you (which would make you the first person this month).

Would be nice: typedefs for the various containers used; encapsulate the values that ReadSegments populates and return that rather than making people pass in multiple items to be populated. If you did this, you could also do cleanup in the object's destructor (but see below). Nit: if you leave out the second parameter to string::substr you get the rest of the string.
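
To make the substr nit concrete, here is a small sketch. The helper name split_field and the use of std::pair are my own choices for illustration, not anything from the posted program; the only thing it relies on is that std::string::substr with a single argument returns everything from that position to the end.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: split an AMF field into its 3-character name and
// its data. substr(3) with no length argument yields the rest of the
// string, so no MAXFIELD-style upper bound is needed.
std::pair<std::string, std::string> split_field(const std::string& field)
{
    // The task statement says the input may be assumed valid; this only
    // guards against substr(3) throwing on a too-short field.
    assert(field.size() >= 3);
    return std::make_pair(field.substr(0, 3), field.substr(3));
}
```

A field that is exactly three characters long comes back with an empty data string, since a start position equal to the string's length is allowed.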

Don't like: you've still managed to succumb to "C programmer's disease" (see article for link), in the 1024-byte field limit and instruction to "modify the define MAXFIELD line... to create a character buffer of appropriate size." There is a version of getline (std::getline(istream,string,delim)) declared in the string header that will let you read in a std::string, without regard for length, which would have been preferred.
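
A rough sketch of what I mean, assuming nothing beyond the standard library. The function names read_chunks and check_read_chunks are invented for this example, and it only shows the reading loop, not the segment handling.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads '|'-delimited chunks of any length. std::getline grows the
// std::string as needed, so there is no fixed-size buffer to outgrow.
std::vector<std::string> read_chunks(std::istream& in)
{
    std::vector<std::string> chunks;
    std::string chunk;
    while (std::getline(in, chunk, '|'))
        chunks.push_back(chunk);
    return chunks;
}

// Usage sketch on an in-memory message; std::cin would work the same way.
void check_read_chunks()
{
    std::istringstream in("SEG|ABCone|DEFtwo||");
    const std::vector<std::string> chunks = read_chunks(in);
    assert(chunks.size() == 4u);  // "SEG", "ABCone", "DEFtwo", ""
    assert(chunks[3].empty());    // the empty chunk marks the "||"
}
```

The empty chunk is what you get between the two pipes of a "||", so it can serve the same end-of-segment role that the strlen(buf) == 0 test plays in the posted code.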

Also with C++ you can leverage automatic cleanup via destructors and you don't need to use dynamic allocation; you can just use a vector of Segment; this also means you don't need to explicitly free the memory.
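
Something along these lines, as a sketch only. The Segment here is a cut-down stand-in with just a name, and make_segments is a made-up function; the point is simply that the vector owns its elements.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Cut-down stand-in for the posted Segment class (name only).
class Segment {
public:
    explicit Segment(const std::string& name) : fName(name) {}
    const std::string& GetName() const { return fName; }
private:
    std::string fName;
};

// The vector holds Segments by value. When the vector goes out of scope
// every Segment is destroyed with it, so there is no new and no delete.
std::vector<Segment> make_segments()
{
    std::vector<Segment> segs;
    segs.push_back(Segment("HDR"));
    segs.push_back(Segment("BDY"));
    return segs;
}

void check_segments()
{
    const std::vector<Segment> segs = make_segments();
    assert(segs.size() == 2u);
    assert(segs[0].GetName() == "HDR");
}
```

Because the members are themselves standard types, the compiler-generated copy constructor and destructor do the right thing when the vector copies or destroys elements.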

czth

[ Parent ]

That was quick (4.00 / 1) (#165)
by enkidu on Thu Aug 14, 2003 at 08:25:48 PM EST

Sorry, I don't really have an easy to access "generic posting place". Perhaps I could have put it in my diary. My apologies to everybody for the overly large post. Doh.

I didn't test it on many compilers, I even forgot to try -Wall. Double Doh.

Missed that return value for ReadSegments, I started writing it intending to return error codes, but forgot about it. I'd probably be able to write it better in C, but it would take a few more lines and take longer to debug. typedefs and such seemed superfluous in such a short hack.

I think the blank segment exists because of some eof() stuff related to streams that I never sat down to figure out.
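
I haven't traced the posted program itself, so treat this as a likely explanation rather than a diagnosis. The sketch below (function names are invented) counts chunks two ways: looping on eof() without checking the read, and looping on the result of the read. eofbit is only set once a read actually runs into the end of the stream, so the first loop goes around one extra time when the input ends with a delimiter.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Loops while eof() is false and never checks whether the read worked.
int count_with_eof_loop(const std::string& text)
{
    std::istringstream in(text);
    std::string chunk;
    int n = 0;
    while (!in.eof()) {
        std::getline(in, chunk, '|');  // result deliberately ignored
        ++n;                           // counted even if nothing was read
    }
    return n;
}

// Loops on the read itself, so a failed read ends the loop immediately.
int count_with_checked_read(const std::string& text)
{
    std::istringstream in(text);
    std::string chunk;
    int n = 0;
    while (std::getline(in, chunk, '|'))
        ++n;
    return n;
}

void check_counts()
{
    assert(count_with_eof_loop("A|B||") == 4);      // one phantom chunk
    assert(count_with_checked_read("A|B||") == 3);  // "A", "B", ""
}
```

That phantom iteration would line up with the extra blank segment, since every message ends in "||".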

Regarding the allocation stuff. Yeah, I still don't completely grok/trust how stl containers use the default copy constructor. Having been bitten once, I have a Pavlovian aversion to having stl containers allocate non-stl objects for me.

Thanks for taking the time to run it. Glad I passed. Since I work as a Software Engineer, it would have sucked to have failed. Any comment on the comments? I like to use doxygen to automatically make documentation. Makes code easier to navigate.

[ Parent ]

Comments on comments (4.00 / 1) (#168)
by czth on Thu Aug 14, 2003 at 08:49:20 PM EST

Re: quick - well, I guess I've gotten a system after having a manager toss several a day at me of late.

I like the comments, we aren't quite that systematic (i.e. no doxygen); usually as detailed or more so, but just perldoc (see the alternate version of the article for a sample), with some formatting standards but a lot of laxity within those (which works well for us, the more you tighten your grip the more star systems will slip through your fingers etc. ;).

One thing that I find works for me: I've gotten in the habit of doing at least a short perldoc (or a single-line C (C++) comment for C (C++) code) for every function - even "internal" "private" functions. I find that helps me quickly scan through and get a good overview of how a module works. But in case of descriptive/short functions like most of yours I wouldn't worry about it.

No software engineer worthy of the name should fail this - I was very surprised how poorly people did when I first started reviewing applicants' code. Maybe our PHBs and their agents (headhunters) just aren't looking in the right places.

czth

[ Parent ]

Your headhunters and ours both (4.00 / 1) (#170)
by enkidu on Thu Aug 14, 2003 at 09:29:29 PM EST

Just so you'd know, if I were interviewing I'd probably spend more time refining the code. In fact, I've noticed that more places are looking for example code now than before. I'm writing some non-work related code, just so I have something to show.

At my work, we've got official coding guidelines but few people follow them rigorously. They aren't laws, just guidelines. Generally, as long as I can figure out what's going on without tearing my hair out, I'm OK with limited comments.

Doxygen kicks ass even if you don't do the full javadoc style commenting. Just getting a complete class hierarchy, member and method usage, and related classes graph can easily point out design weirdness and weakness. Pretty easy to set up and runs fast also.

How about this interview war-story: Interviewing for a Unix Systems Administrator position, this guy had "experienced UNIX administrator" on his resume. Question: How would you search for a file within a directory tree? Answer: ls -lR and grep for the file name. That's when I ended my interview and told the HR lady to send him home. The kicker is, it wouldn't work, because the grep would just return the file permissions, owner etc. and name, but not the directory path.
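The usual shell answer is find <dir> -name <file>. To spell out why the directory matters, here is a minimal Python sketch of the same search; the function name find_by_name is made up for illustration. It walks the tree and returns full paths, which is the part that grepping ls -lR output throws away.

```python
import os


def find_by_name(root, name):
    """Return the full path of every file called `name` under `root`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            # Join with dirpath so the directory is part of the answer,
            # which is exactly what grepping `ls -lR` output loses.
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```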

Thanks for the feedback and "the task".

[ Parent ]

If you're not too bored... (none / 0) (#203)
by Homburg on Fri Aug 15, 2003 at 05:11:30 PM EST

...of looking at amateur attempts at your task, I'd be interested to see what you make of my attempt in C++. I'm interested to see that it's much longer (2-3 times) than any of the other responses people have posted here. Part of that may come from C++'s verbosity, but it may well be overdesigned and/or foolishly implemented. I think it is a cleaner example of modern C++ than the example above, though (no magic numbers, no use of new).

It also provided me with a good opportunity to reacquaint myself with some of the subtleties of the IOStreams library (istream::peek() doesn't set eofbit! istream::get(...) doesn't extract the delimiter, while istream::getline(...) and std::getline(istream, ...) both do!), which is always useful.

[ Parent ]

Here's my take on it (none / 0) (#206)
by enkidu on Fri Aug 15, 2003 at 08:04:21 PM EST

I don't think you were looking for an answer from me (the bonehead who posted the original code), but I'll answer anyway. Note that I'm no software guru or teacher. Just a regular old code-slinger, waiting for a long compile and regression run to finish.

First of all your line lengths are > 80. Should be easy to correct though. Also formatting related, I prefer to have class and method comments before the class/method declaration, not between the declaration and body. Not sure what everybody else does. On the plus side, the comments made sense, but could have done with some more organization. Like separating the basic description and the algorithm.

Content related. Is there a specific reason to use a class for field instead of a pair? You could always typedef field to be a pair<string, string> if you wanted to use a specific name.

Declaring a template for named_list with two implementations, segment and message, seems like big overkill to me, especially when the usage of the two will probably be radically different. Templates are expensive with regard to compile and link time due to the fact that each object file has to make its own copies of any methods it uses. Then they all get cleaned and relinked during linkage. This is a huge bother. (Well, Tru64 cxx uses a cxx_repository for all common template generated code, but that turns into another headache with regard to dependencies and such).

Of course, not being a C++ template guru, I could be totally wrong about this but in my opinion, by using templates, you paint yourself into a corner with regard to future improvements in speed/caching and such. Your problem implementing find_result is a good example of the inefficiencies introduced by having to work with templates and the copying that becomes inevitable. Also, I think there might be a bug in your code in the fact that you never set your sorted_objects container to dirty when additional objects are added to the named_list.
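The dirty-flag point can be sketched in a few lines. This is a hypothetical Python stand-in, not the C++ under review: the class and attribute names are invented, and it only shows the pattern of sorting lazily and invalidating the cache on every insert.

```python
class NamedList:
    """Hypothetical stand-in for the named_list discussed above."""

    def __init__(self):
        self._items = []      # (name, value) pairs in insertion order
        self._sorted = []     # cached copy ordered by name
        self._dirty = False   # True when the cache no longer matches _items

    def add(self, name, value):
        self._items.append((name, value))
        self._dirty = True    # the step that is easy to forget

    def sorted_items(self):
        # Sort only when something changed since the last call.
        if self._dirty:
            self._sorted = sorted(self._items, key=lambda item: item[0])
            self._dirty = False
        return self._sorted
```

If add() skipped the flag, a lookup made after a later insert would silently use the stale sorted copy.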

Naming, I don't like names like good_. Perhaps, good_parse or something. Name globals or statics something that indicates that they are globals or statics, like sdelimiter or s_delimiter. Use different naming conventions for methods, classes and variables. I've become used to InterCaps for classes and methods, ALLCAPS for macros, prefixes for templates, vectors, maps, globals and statics like tTemplate, gmGlobalMap and svStaticVector, and all lower for local variables. This is a bastardization of Taligent's coding guidelines I believe.

On the plus side, I think your implementation was much more correct in terms of type-correctness, type-naming, and error catching.

Whoops, compile's all done. Hope my comments were useful. Cheers.

[ Parent ]

Comments on code (and on other comment) (none / 0) (#207)
by czth on Fri Aug 15, 2003 at 08:58:25 PM EST

I said in another comment I wouldn't be commenting on any more attempts at this task but since you asked, and asked nicely :), my comments, for what they're worth. Also replying to this comment.

Generally liked it a lot. Agree with enkidu about the line lengths, but it's a minor nit. Also your comment style is what works for you and I'm sure you'd have no trouble adapting to whatever was prevalent wherever you worked; it's quite acceptable in terms of showing what's going on.

I disagree with enkidu about the typedefs, I'd do the same, it makes things much more readable and doesn't cost you much except a little at compile-time - names speak to purpose rather than function (e.g. fields::iterator vs. vector<pair<string,string> >::iterator). I also think a class for a field makes good sense, again in terms of code readability. If it happens to be slower than a pair then it can be optimized later (inlining etc.), but fieldObject.data() beats fieldObject.second any day in terms of reading code, especially if you just want to read a small chunk of it and make a change. Write for clarity and correctness first and optimize later.
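The readability argument is not specific to C++. As a rough illustration in Python (the Field type here is hypothetical, not taken from the submitted code), compare reading a bare pair with reading a named field:

```python
from collections import namedtuple

# Hypothetical field type: a 3-character name plus its data.
Field = namedtuple("Field", ["name", "data"])

as_pair = ("ZIP", "07440-1913")
as_field = Field(name="ZIP", data="07440-1913")

# Both hold the same values, but only one says what they mean.
zip_from_pair = as_pair[1]
zip_from_field = as_field.data
```

The two forms store the same thing; the named one just keeps the reader from having to remember which slot is which.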

You also seem to consistently use _ after names for member variables (have also seen the m_ prefix in VC++/MFC shops); enkidu complains but it's consistent, so although I don't usually do that I have no problem with it. Oh, and Hungarian notation is horrible; I disagree with him there too (sorry :). A name should reflect an object's use, not its implementation.

Clean compile with -Wall (g++ -Wall message.cc message-test.cc -o amf), kudos. Noticed you sometimes used message.h and sometimes message.hh, though. I dislike the practice of using .hh for C++ header files, but that's a personal style issue (and I do use .cc for C++ source files). Initial tests I did all worked, even with an empty segment (which isn't legal and you wouldn't see one as input, but it's sometimes fun to test the boundaries).

Good use of the STL containers and algorithms. (Hmm... just noticed a stray "std::string line" in field::data()... typo/erroneous paste I presume.) I like that you don't sort before it's necessary, too, and that template function is also a good solution to pulling out multiple fields with the same name. Little curious why you used a char[] rather than just a string to store field names (efficiency?) and why you used the (begin_iter,end_iter) std::string constructor rather than (value_type*,length). One thing about errors, though, you should probably be propagating more information back in the exceptions you throw (either passing or to a debug output), but it's fine for a test (e.g. in real code we'd like to be able to get file and line information, perhaps, or a more detailed description).

Good solid code, anyway. Don't worry about the length, looks like it's mainly due to whitespace and comments, and that's OK. We'd interview you, too (well, if you could do the equivalent in C, as we unfortunately don't use C++ here, somewhat for political raisins and partly expedience: more languages required = harder to find people that know them).

czth

[ Parent ]

Oh go on. (none / 0) (#233)
by synaesthesia on Mon Aug 18, 2003 at 05:24:39 AM EST

I posted it all the way back on Thursday, and you still haven't told me whether or not I get an interview ;)


Sausages or cheese?
[ Parent ]
Re-read the last paragraph :P -nt (none / 0) (#235)
by czth on Mon Aug 18, 2003 at 09:44:22 PM EST



[ Parent ]
Well why didn't you just say so?! :P -nt (none / 0) (#237)
by synaesthesia on Tue Aug 19, 2003 at 07:21:48 AM EST



Sausages or cheese?
[ Parent ]
I'm 0'ing this because... (5.00 / 1) (#180)
by jjayson on Fri Aug 15, 2003 at 05:57:38 AM EST

proper etiquette for posting code of any substantial length is to post it as a reply to a top-level comment to prevent flooding the comment page with this stuff.

Please think before you post next time.
--
This space for rent.
[ Parent ]

My apologies (none / 0) (#196)
by enkidu on Fri Aug 15, 2003 at 02:10:30 PM EST

Sorry about that. I realized it half a second after I pressed "Post".

[ Parent ]
Code I actually saw from a college classmate (5.00 / 3) (#169)
by clarkcox3 on Thu Aug 14, 2003 at 09:16:58 PM EST

/* --1000 or so lines of code-- */

Their rationale was that the compiler seemed to work faster, the more of their code they commented out. So, they decided to comment out the entire file. They then wondered why they couldn't make their program work.

I bet that helps execution time too. /nt (4.50 / 2) (#172)
by skyknight on Thu Aug 14, 2003 at 11:03:27 PM EST



It's not much fun at the top. I envy the common people, their hearty meals and Bruce Springsteen and voting. --SIGNOR SPAGHETTI
[ Parent ]
A solution in Standard ML (4.00 / 2) (#173)
by Smerdy on Thu Aug 14, 2003 at 11:42:46 PM EST

It lives here.

It's a lot shorter and easier to follow than the one previously posted, in my opinion. It's also guaranteed not to allow a buffer overrun or crash, regardless of the input assumptions, since it's written in a type-safe language. ;-)

Nice (4.00 / 1) (#174)
by czth on Fri Aug 15, 2003 at 12:09:45 AM EST

Some more comments might have made it easier to follow for non-MLers but I did well enough, helps to have been exposed to some FP in the past. Note that the Perl solution I posted is also type safe etc., as would be a solution in most "scripting" languages.

Question: do you prefer standard ML or Ocaml and why? How would the solution differ if you used (idiomatic) Ocaml?

czth

[ Parent ]

Eh (4.00 / 1) (#175)
by Smerdy on Fri Aug 15, 2003 at 12:55:10 AM EST

Some more comments might have made it easier to follow for non-MLers but I did well enough

One of the super-duper things about ML is the general lack of need for comments. You might have noticed that the types in the signature give away most of the game. I could have probably left out comments for them and you (with a knowledge of the type system shared by ML, Haskell, Clean, etc.) could have guessed exactly what they did from their names and types. When it comes to actual function implementations, things should generally be pretty clear with a little care in writing code. Breaking functions into a bunch of small functions with their own type annotations can aid in this.

Overall, if someone who doesn't know a programming language well is reading your code written in it because he is going to maintain it, then you are already sunk. ;-)

Question: do you prefer standard ML or Ocaml and why? How would the solution differ if you used (idiomatic) Ocaml?

I slightly prefer Standard ML because it is based on a standard, has a slightly more "functional" and "elegant" feel, and because I graduated from a university where Standard ML is used by all of the programming languages researchers and is integrated into the core CS curriculum. :-)

The code I posted translates trivially to O'Caml. I think you can translate it solely by replacing textual tokens with O'Caml equivalents, save for slight changes related to different standard library interfaces.

[ Parent ]

A solution in K. (4.00 / 2) (#179)
by jjayson on Fri Aug 15, 2003 at 05:52:43 AM EST

/ czth.k

/ ----- PROBLEM -----
/ Write an object-oriented parser that can parse messages of the following type
/ (AMF, ASCII Message Format) and allows ordered traversal as well as searching
/ for segments and fields within the current segment by name:
/
/ A message is an ASCII string of segments each ending with a double-pipe ("||").
/ A segment consists of one or more pipe ("|") separated strings; the first
/ string is the segment's name.  The remaining strings are fields; the first 3
/ characters are the field name, the rest are the data.  You may assume that
/ the input data is valid.
/
/ Sample message:
/
/ fmlog|OIDHICAD1|LCLBASRC|RIDHILEX|TSP1031348701||IACN|TYPL|PROCHHCH|LLL1A|BKCA**NGH|SOUAGENT||txnorg|OIDHICAD1|LCLBASRC|TRM10.191.0.25|TIPAMFSAM|RIDHILEX|SIDG8600||addr1|TYPH|STR4554 Eleventh Street|APT|STANY|ZIP07440-1913|COUUS||addr2|TYPB|STR202 Fifth Avenue|APT12|STANY|ZIP07410-1001|COUUS||
/
/ The code you write should be extendable production-level code.

/ ----- SOLUTION -----
/ p[m]: given message m, parse into named set vectors (parsed message)
p:{+{(*x;+{(3#x;3_ x)}'1_ x)}'{1_'(&x="|")_ x:"|",x}'-1_2_'(n _ss"||")_ n:"||",x}

/ s[set; name]: search for given name in set. If set is a parsed message, then
/ name should be the name of a segment. If set is a segment (such as from a
/ previous call to s), then name should be the name of a field.
/ ex: s[p m; "txnorg"] returns the named set representation for the txnorg segment
/     s[s[p m; "IACN"]; "BKC"] return the value from field "BKC" of segment "IACN"
s:{x[1]@x[0]?y}

/ For more advanced search capabilities, just use the K vector operations and
/ regular expressions. For example to find all fields that start with "T"
/ from all segments while retaining proper shape:
/    q[1;;1]@'&:'q[1;;0]_sm"T*"
/ returns:
/    (,"1031348701"
/     ,,"L"
/     ("10.191.0.25"
/      "AMFSAM")
/     ,,"H"
/     ,,"B")

/ ----- NOTE ON OO -----
/ Object-orientation in K is through the vector. Data is sliced field-wise, instead
/ of column-wise (object-wise). You do not iterate explicitly, but implicitly or
/ by using adverbs.
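For readers who don't read K, here is a plain Python sketch of the same problem statement, under the same "input is valid" assumption. The class and method names (Message, Segment, find) are invented for illustration and are not anyone's submitted solution.

```python
class Segment:
    """One named segment with its fields kept in message order."""

    def __init__(self, text):
        parts = text.split("|")
        self.name = parts[0]
        # First 3 characters are the field name, the rest is the data.
        self.fields = [(part[:3], part[3:]) for part in parts[1:]]

    def find(self, field_name):
        """Return the data of every field called `field_name`."""
        return [data for name, data in self.fields if name == field_name]


class Message:
    """A parsed AMF message: an ordered list of segments."""

    def __init__(self, text):
        # Every segment ends with "||", so the last split piece is empty.
        self.segments = [Segment(s) for s in text.split("||") if s]

    def find(self, segment_name):
        """Return every segment called `segment_name`, in order."""
        return [s for s in self.segments if s.name == segment_name]
```

Both find methods return lists, so a segment or field name that appears more than once is not lost, and ordered traversal is just iteration over segments and fields.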


--
This space for rent.

I don't think this satisfies the specification (4.00 / 1) (#190)
by Smerdy on Fri Aug 15, 2003 at 11:52:39 AM EST

under my interpretation of it because:

  • There is no data abstraction/"object orientation". The user of the code is required to know about your implementation (that you used a vector)

  • You use linear search for fields, which is unacceptably inefficient for "production code" as I understand it. Not even parsing the string and searching it anew each time for a segment or field would be about as fast as your solution.


[ Parent ]
    No, and no. (5.00 / 1) (#201)
    by jjayson on Fri Aug 15, 2003 at 04:09:50 PM EST


    • There are only vectors in K. To just use the search function (s), you do not need to know about anything. However, to use the K primitives, you do need to know about the named-set construction, a common construction in many programs. K's idiomatic code and data structures mean that you often reuse the same forms and get to know how to deal with them very well.

      There is no C++-style OO in K. In an object, all data for a single instance is inside the object, and if you need to add 4 to the entire set, you iterate through each object adding 4 to the appropriate field. However, in K data is sliced the other way; there are columns of data. To add 4 to all instances, you add 4 to the appropriate column. Much more efficient than any other way; it's a collection-oriented language, not scalar-oriented.

    • The find primitive (?) is not always linear search. It chooses an appropriate method and is often blisteringly fast. We don't know how many segments are in each message, and we don't know how many fields are in each segment. Without knowing your input data, it is not possible to optimize very well. If the example message is any indication, then linear search will often be faster than more advanced hashing methods.

      If this is too slow, it is about a dozen characters to implement either a hashing method or binary search (hash uses the same find primitive, while binary search uses the _bin primitive).

      Making the claim, "It uses linear search so it isn't production quality," without knowing any benchmarks or even possible inputs is silly.
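To make the "data is sliced the other way" point from the first bullet concrete, here is a small Python sketch of the two layouts. The field names id and qty are invented, and plain Python lists say nothing about K's speed; this only shows the shape of the data.

```python
# Object-style layout: one record per instance.
rows = [{"id": 1, "qty": 10}, {"id": 2, "qty": 20}, {"id": 3, "qty": 30}]
for row in rows:
    row["qty"] += 4          # visit each object, touch one field

# Column-style layout: one list per field.
columns = {"id": [1, 2, 3], "qty": [10, 20, 30]}
columns["qty"] = [q + 4 for q in columns["qty"]]   # one pass over one column
```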

    It's a single line of code! How hard can it be to change 1 line? I can rewrite this from scratch each time a change is needed and development time will still be faster than in any other language. K often has brutal speed when dealing with bulk data too.

    K is for Krunk.

    --
    This space for rent.
    [ Parent ]

    Response (3.00 / 1) (#204)
    by Smerdy on Fri Aug 15, 2003 at 06:22:47 PM EST

    To "there are only vectors in K" as a justification for no abstraction: are you saying that if you chose to use a different structure (like a balanced search tree or anything not built into the language/compiler) the same code would suffice? There would be no way for the code user to have depended on the old format and thus be forced to change all of his calls and recompile?

    I am also unconvinced that the K language brings any advantages over functional languages. Is there any fundamental reason why the K features that you like can't be implemented just as efficiently in ML? With ML's (or Haskell's) user-defined infix operators, I think you could even use mostly the same syntax with a little bit of work defining a K emulation library. This would leave you with all the usual ML features for the non-vectorized computations that you say K isn't suited for.

    [ Parent ]

    K is another, older world (5.00 / 1) (#205)
    by jjayson on Fri Aug 15, 2003 at 07:41:22 PM EST

    How you operate on data is determined by what you need to do. There is often a better way to handle data than balanced trees. K has binary search verbs (_bin, _binl), so to get similar performance characteristics you just use a sorted list (insertion is handled in bulk and, with K's excellent memory handling, amortizes to log performance, or maybe even linear if your data is well behaved enough).

    Also, like I said, if you are really concerned about that, make a function and just use it (like I did with the search function s). Then you can change implementation, just like any other functional construct.

    Yes, you could implement K-like characteristics in C, ML, Scheme, C++, Java, assembly, or any number of languages. However that isn't saying much. It's like asking why not just implement ML in Lisp. I don't really have a desire to expound on why this isn't the same thing though (too tired).
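The sorted-list-plus-binary-search idea looks roughly like this in Python, using the standard bisect module. This is only a sketch of the general technique with invented function names; it makes no claim about how K's _bin is implemented.

```python
import bisect


def sorted_find(names, values, key):
    """Binary-search `names` (kept sorted) and return the matching value."""
    i = bisect.bisect_left(names, key)
    if i < len(names) and names[i] == key:
        return values[i]
    return None


def sorted_insert(names, values, key, value):
    """Insert while keeping both parallel lists ordered by name."""
    i = bisect.bisect_left(names, key)
    names.insert(i, key)
    values.insert(i, value)
```

Lookups are logarithmic in the number of names; single inserts into a Python list are still linear, which is why bulk insertion followed by one sort is the cheaper route.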

    K isn't compiled either. It is entirely interpreted. KDB, the database written in K, is still faster than anything else on the market.

    Check my story history for the "Shallow Introduction to K" story I wrote last year if you want.
    --
    This space for rent.
    [ Parent ]

    Oh, the coolest thing (5.00 / 1) (#202)
    by jjayson on Fri Aug 15, 2003 at 04:12:49 PM EST

    It extends to arbitrary depth fields: (p q)s/a
    will search message q at depth given list of keys a.

    This is just for free. No code changes needed.
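As far as I understand it, s/ folds the two-argument search over the list of keys, feeding each result into the next lookup. A rough Python equivalent, assuming a nested dict in place of K's named sets (both function names are invented), would be:

```python
from functools import reduce


def search(node, key):
    """Look up one key in one level of a nested mapping."""
    return node[key]


def deep_search(tree, keys):
    """Apply `search` once per key, feeding each result into the next."""
    return reduce(search, keys, tree)
```

With an empty key list the fold returns the structure untouched, so the same call works at any depth.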
    --
    This space for rent.
    [ Parent ]

    nice, but... (4.00 / 1) (#187)
    by ksandstr on Fri Aug 15, 2003 at 10:31:17 AM EST

    "glib" doesn't stand for GNU Library.  Just g utility library, gtk utility library or something like that.  No need to read too far into every /g.+/ name, you know?


    Fin.
    Gimp library (none / 0) (#231)
    by jjayson on Sun Aug 17, 2003 at 03:13:48 AM EST

    At least originally, I think.
    --
    This space for rent.
    [ Parent ]
    Spurious braces (4.00 / 2) (#193)
    by Merc on Fri Aug 15, 2003 at 12:51:06 PM EST

    if(donaldDuck[idd].goofy.size() > 0)
    {
      for ( i = first; i<last; i++ )
      {
        i2++;
      }
    }

    You give this example when talking about spurious braces. I happen to believe that you should always use braces on if, for, while, and other similar constructs. Why? Indentation.

    Especially when idi^H^H^Hpeople choose to use tabs in their source files, the indentation can really look odd. Add to that the idea of posting code to a website (where whitespace is collapsed) and you can have some real problems. Because of that, it isn't always easy to see what statement is associated with a given branch.

    if(donaldDuck[idd].goofy.size() > 0)
    for ( i = first; i<last; i++ )
    i2++;
    else
    i2=-1;

    At a glance, can you tell me if the above is correct? Can you be sure that i2++ is within the for loop, or is it part of the if statement? Now compare with braces:

    if(donaldDuck[idd].goofy.size() > 0)
    {
    for ( i = first; i<last; i++ )
    {
    i2++;
    }
    }
    else
    {
    i2=-1;
    }



    Moot point (none / 0) (#197)
    by czth on Fri Aug 15, 2003 at 02:57:44 PM EST

    - because I indent my code. Perhaps others don't, but they will be shot as needed. With proper indentation I can tell quite easily. Without proper indentation I can usually track it down and deliver thwacks to the guilty quickly too.

    I'm a minimalist and usually don't use braces where unnecessary except: with the dangling else (to make it clearer) or if it's a big enough block (again, for clarity).

    There's a line each person chooses to draw between safe and wasteful. For example I sometimes like to switch ==s where possible to avoid accidentally typing = (e.g. if(2 == a) rather than if(a == 2)) but some people would argue that's horrible and sacrifices clarity; I argue wasting vertical screen space with too many braces can do the same thing. It's often personal, not business - just like brace placement itself (1TBS forever! for much the same reason).

    czth

    [ Parent ]

    too much noise, too little signal (none / 0) (#230)
    by jjayson on Sun Aug 17, 2003 at 03:11:30 AM EST

    With your first little code segment you have 7 lines of code. I would have written that as 2:
    if(donaldDuck[idd].goofy.size()>0)
      for(i=first; i<last; i++) i2++;
    You have so much damn space. I save 70% of the vertical space. I can squeeze 245 semantically meaningful lines in the same space you fit 70. I can view entire algorithms, split across multiple functions, on the screen without paging all over the place. I can zone out on a large code segment to debug. I can't stand all the space.

    Actually, I would probably lift the for-loop onto the conditional line as that demonstrates the flow better. Learn to read dense notation and you will learn to love it.
    --
    This space for rent.
    [ Parent ]

    dont knock comments (4.33 / 3) (#210)
    by bolthole on Sat Aug 16, 2003 at 03:08:28 AM EST

    I can't believe you make such a big deal about "production level" code... and the first thing you do, is criticise a guy for writing production-level style comments!

    Sure it was overkill for the particular project, but it also shows that the guy knows how to write seriously large-scale code, and keep it organized.

    Or is your real gripe that you just didn't like his code? in which case, you should have left out the example entirely.

    Or, contrariwise, shown it as a POSITIVE example of how to do comments, and not mentioned the quality of the actually submitted code.

    That is of course, assuming that you actually WANT to have a team that can code to very large levels of complexity, and maintain it well. The bulk of your rant seems to IMPLY that. But your initially described actions make me wonder.

    Those are not production level comments (4.50 / 2) (#229)
    by jjayson on Sun Aug 17, 2003 at 03:01:33 AM EST

    They're useless ASCII art. Production level code means as minimal as possible, but no more minimal than needed. When half a file is filled with that crap it becomes distracting.
    --
    This space for rent.
    [ Parent ]
    Gaaahhhh.... (none / 0) (#236)
    by bolthole on Tue Aug 19, 2003 at 01:54:04 AM EST

    You're not paying attention in class, obviously.

    "minimal comments" are "useless comments".

    "minimal comments" == "understandable to YOU" (right now, as you're hip deep in the code)

    A decent amount of commenting, helps someone ELSE understand the code. And when you come back in 5 years and have to look at it, you will essentially be in the shoes of 'someone else' looking at the code, comparative to when you wrote them. The sooner you learn that, the sooner you'll learn how to write decent comments.

    and someday, you WILL be in that position, unless you have completely given up looking at code, or are incapable of it in 5 years.

    --Speaking as someone with 20+ years of coding experience.


    [ Parent ]

    I like comment ascii frames (none / 0) (#238)
    by phred on Wed Aug 20, 2003 at 04:49:09 PM EST

    If I'm in a hurry, I like to stop at function headers to read what the function does, and those great big eye catching ascii things say "read me". Minimalistic comments are ok if you plan to read the entire body of code, but I like those ascii things, they're like bookmarks.

    [ Parent ]
    %AMF_fields *is* threadsafe (5.00 / 1) (#212)
    by DylanQuixote on Sat Aug 16, 2003 at 07:45:55 AM EST

    %AMF_fields *is* thread-safe. In perl (5.8),
    variables are not shared between threads.
    You've got to share() stuff in order for
    it to be shared.

    I've actually seen people do eval "use $Module";
    instead of require "$Module.pm"; import $Module;


    heh... (3.00 / 4) (#213)
    by pb on Sat Aug 16, 2003 at 12:56:27 PM EST

    I just read this article on your site, czth; you caught my attention with some humorless (but likely correct) moderation you did to one of my comments, so I figured I'd find some way to reply, to test your sense of humor.

    Here is a perl submission that may fulfill all objective criteria in your problem statement; it certainly seems to parse the example, at least. Also, it's shorter than the provided sample data (but naturally not shorter than jjayson's K submission!):

    $_="fmlog|OIDHICAD1|LCLBASRC|RIDHILEX|TSP1031348701||IACN|TYPL|PROCHHCH|LLL1A|BKCA**NGH|SOUAGENT||txnorg|OIDHICAD1|LCLBASRC|TRM10.191.0.25|TIPAMFSAM|RIDHILEX|SIDG8600||addr1|TYPH|STR4554 Eleventh Street|APT|STANY|ZIP07440-1913|COUUS||addr2|TYPB|STR202 Fifth Avenue|APT12|STANY|ZIP07410-1001|COUUS||";
    $o=sub{my@s;foreach(split/\|\|/){my@t=split/\|/;my$n=shift@t;my%d;foreach(@t){$d{substr($_,0,3)}=substr($_,3);}push@s,{"n",$n,"d",\%d};}return@s;};bless$o;my@s=&$o($_);foreach$k(@s){print$$k{"n"}."\n";foreach$l(keys%{$$k{"d"}}){print"\t$l=>".$$k{"d"}{$l}."\n";}}

    Cheers.
    ---
    "See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
    -- pwhysall

    Can you please try to shorten it a bit? -nt (4.00 / 1) (#214)
    by czth on Sat Aug 16, 2003 at 01:48:21 PM EST



    [ Parent ]
    naturally, (none / 0) (#215)
    by pb on Sat Aug 16, 2003 at 02:50:50 PM EST

    but I'm not a one to skimp on detail; I wouldn't want to shorten my code by 20 characters at the expense of its object-oriented, encapsulated nature, for example, because that would be against the spec. This isn't Perl Golf, after all. :)
    ---
    "See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
    -- pwhysall
    [ Parent ]
    page-widener gets a 0 rating from me /nt (1.00 / 2) (#217)
    by cbraga on Sat Aug 16, 2003 at 05:38:34 PM EST



    ESC[78;89;13p ESC[110;121;13p
    [ Parent ]
    well, we all have our pet peeves. (3.50 / 2) (#219)
    by pb on Sat Aug 16, 2003 at 08:57:42 PM EST

    For example, abuse of the zero rating by rating actual, on-topic content of mine gets a zero rating from me; go figure.
    ---
    "See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
    -- pwhysall
    [ Parent ]
    Well... (none / 0) (#220)
    by cbraga on Sat Aug 16, 2003 at 09:05:23 PM EST

    I won't argue your rating-policy, but your content looks like it came from /dev/random

    Heck, proper formatting can't be that hard to do.

    ESC[78;89;13p ESC[110;121;13p
    [ Parent ]

    proper formatting? (none / 0) (#221)
    by pb on Sat Aug 16, 2003 at 09:22:38 PM EST

    As I believe I mentioned, my code may fulfill all the objective criteria set forth in the spec; I said nothing of the subjective criteria. However, if you object to the style used therein (which is typical of a certain breed of Perl programs) then perhaps you could shell out $0.00 for something that reformats it to some other format that may or may not meet your particular specifications.

    In no case would it change the meaning of the program in question, however, which you are free to test and execute as is or otherwise. As it is incredibly statistically unlikely for /dev/random to produce a valid perl program of even this length, I will give you the benefit of the doubt once and assume that you were simply ignorant of this fact, or didn't take the time to perform even the most elementary of statistical analyses instead of actually ascribing malicious intent by your carelessly chosen words.

    Good day.
    ---
    "See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
    -- pwhysall
    [ Parent ]

    I hereby define proper formatting as (none / 0) (#222)
    by cbraga on Sat Aug 16, 2003 at 09:28:53 PM EST

    something that doesn't screw up the screens of every user's browser.

    You also presume to insult my intelligence arguing that your program could never come out of /dev/random, but I never said that. I said it looks like it came out of /dev/random. Haha, your time was wasted.

    ESC[78;89;13p ESC[110;121;13p
    [ Parent ]

    heh. (3.00 / 1) (#223)
    by pb on Sat Aug 16, 2003 at 09:35:34 PM EST

    That's the great thing about subjective definitions; you can define them however you wish. However, I highly doubt that my formatting screwed up the screens of "every user's browser", and hence, it was properly formatted.

    My program doesn't even vaguely look like something that came out of /dev/random; what are the odds that there would be no whitespace, for example, or no high ascii characters, or that high a percentage of lowercase characters? If you can figure out how to use uudecode, then you can compare them for yourself; here's a similar amount of output from /dev/random, uuencoded. (note that it wouldn't post otherwise--another obvious difference)

    begin 644 -
    M`&JIN`I@;EZ=(V]:YFZND[3()G3:O$>SYX87_Q)_J^?O3-3+#024]\C3[$EJ
    M7'#K:U%*'T>B1-IDM;]=;>FE&X/N_3A09$6'UA>G#3HE$4S`JH/F4)Z=,A=9
    M0_^P7D:A[ZQ`;6.D@W>._J5'KP*4#LU(`3S98#M8@@F]K5IE4SUC,EU7WO1?
    M^&>;52>LE+V=&&6M<B8FWKT1IP79L3:)2#E2^C((3K,WTEC`QWBW?L7+`OU!
    M7].'-I1[-E$FONOJ4%QC:DM;A,7V2S_K8RCP<F_X'-AHV;QKI24RD>_.S\L!
    M")E=9_K)N+*;;$WE\=>'Y"1'ZM:RX(/94/"A/]L.4'ZD1BZ97V(0XO-MXEI1
    M'Z(%X%I5D$RFN@/VH/H4GK<(HNVTQ0[3;1[/?X<(Q\'\Y?577L;:,25)*\I!
    MRA;X-50^4]_O&+DL`.Z4B[VFY)QU/C&EGY0^M\R[SQS'73VLGIQA3(7=R3R1
    M6XIY&)PCJ@;@%X]7BKPL@MV\R_!`R6\6\-*>($AR1F,6=K2*[$8&N2&0^&B.
    M(=V1F;20@'*].ZH!1SS;+O:T+S]B:%H(H8X7^<A1\9]KVANMK]+U;76;O3;C
    MY4@&7%AHM/<=5H_`\?:PMC[HC>?[;U`.Z?EEK2\$NS=,$Y.R*-9.JF?NR'(7
    MXX-*<PN%S*'#]<64YZM90%F;_O?5BFB[5$"&ZJ[>Y6+4^E728!FQ/5^/?1=P
    HKI3&?V79_7STKV"WADZ.D\7[PYBQ%(SMWW3.HD+GHI%*HU*8UGIE`0``
    `
    end
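For anyone who wants to put a number on "what are the odds", here is a back-of-the-envelope sketch in Python. It assumes each byte from /dev/random is uniform and independent, and it counts only the no-whitespace, no-high-ASCII condition (94 printable non-space characters out of 256 byte values); the function name is made up.

```python
import math


def log10_prob_all_printable(n, allowed=94):
    """log10 of the chance that n uniform random bytes all land in a set
    of `allowed` byte values (94 = printable ASCII without the space)."""
    return n * math.log10(allowed / 256)
```

For 300 bytes this comes out at roughly -130, i.e. about one chance in 10^130, before even asking whether the result parses as Perl.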

    ---
    "See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
    -- pwhysall
    [ Parent ]

    Are you still trying to insult my intelligence? (none / 0) (#224)
    by cbraga on Sat Aug 16, 2003 at 09:51:52 PM EST

    You're not coming even close. In fact, looking at the sample you so conveniently provided I am even more convinced your code looks like it came from /dev/random. They're both a bunch of random uppercase letters and have no structure.

    In fact, your sample of randomness has even more structure since it's conveniently separated in lines so as not to screw up the window in my browser. Thanks. Perhaps you could have uuencoded your program as well?

    ESC[78;89;13p ESC[110;121;13p
    [ Parent ]

    apparently I don't need to, (none / 0) (#225)
    by pb on Sat Aug 16, 2003 at 10:01:30 PM EST

    seeing as how (a) you apparently don't know how to use uudecode, and (b) you also can't seem to differentiate uppercase from lowercase.

    I looked into compressing and uuencoding my program, actually, but it would have made it larger. If that doesn't make sense to you, then just take my word for it; it has to do with how uuencode works, and also with how compression file formats store extra information in their headers.
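
The size argument can be sketched as follows. uuencode writes 4 output characters for every 3 input bytes, plus a length character and a newline on each line of at most 45 bytes, so compression has to save more than about a quarter of the input just to break even. This is a minimal sketch using Python's `binascii.b2a_uu` and `zlib` (an assumption; the thread does not say which compressor was tried); `uuencoded_size` and `packed_size` are hypothetical helpers, and the begin/end lines are left out of the count.

```python
import binascii
import zlib

UU_LINE_BYTES = 45  # uuencode packs at most 45 input bytes per output line


def uuencoded_size(data: bytes) -> int:
    """Size in bytes of the uuencoded body, excluding the begin/end lines."""
    return sum(
        len(binascii.b2a_uu(data[i:i + UU_LINE_BYTES]))
        for i in range(0, len(data), UU_LINE_BYTES)
    )


def packed_size(data: bytes) -> int:
    """Size after compressing with zlib and then uuencoding the result."""
    return uuencoded_size(zlib.compress(data))


if __name__ == "__main__":
    # Incompressible stand-in data: every byte value exactly once.
    sample = bytes(range(256))
    print("raw:", len(sample))
    print("uuencoded:", uuencoded_size(sample))
    print("compressed + uuencoded:", packed_size(sample))
```

For short or dense input the compressor's fixed header and checksum eat most of what it saves, and the 4/3 expansion is then applied on top of that.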
    ---
    "See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
    -- pwhysall
    [ Parent ]

    Here you go (5.00 / 1) (#226)
    by cbraga on Sat Aug 16, 2003 at 10:11:43 PM EST

    Your program uuencoded. Doesn't it look much better?

    begin 644 p
    M7STB9FUL;V=\3TE$2$E#040Q?$Q#3$)!4U)#?%))1$A)3$58?%134#$P,S$S
    M-#@W,#%\?$E!0TY\5%E03'Q04D]#2$A#2'Q,3$PQ07Q"2T-!*BI.1TA\4T]5
    M04=%3E1\?'1X;F]R9WQ/241(24-!1#%\3$-,0D%34D-\5%)-,3`N,3DQ+C`N
    M,C5\5$E004U&4T%-?%))1$A)3$58?%-)1$<X-C`P?'QA9&1R,7Q465!(?%-4
    M4C0U-30@16QE=F5N=&@@4W1R965T?$%05'Q35$%.67Q:25`P-S0T,"TQ.3$S
    M?$-/5553?'QA9&1R,GQ465!"?%-44C(P,B!&:69T:"!!=F5N=65\05!4,3)\
    M4U1!3EE\6DE0,#<T,3`M,3`P,7Q#3U554WQ\(CL*)&\]<W5B>VUY0',[9F]R
    M96%C:"AS<&QI="]<?%Q\+RE[;7E`=#US<&QI="]<?"\[;7DD;CUS:&E F=$!T
    M.VUY)60[9F]R96%C:"A`="E[)&1[<W5B<W1R*"1?+#`L,RE]/7-U8G-T<B@D
    M7RPS*3M]<'5S:$!S+'LB;B(L)&XL(F0B+%PE9'T[?7)E='5R;D!S.WT[8FQE
    M<W,D;SMM>4!S/28D;R@D7RD[9F]R96%C:"1K*$!S*7MP<FEN="0D:WLB;B)]
    M+B)<;B([9F]R96%C:"1L*&ME>7,E>R0D:WLB9")]?2E[<')I;G0B7'0D;#T^
    7(BXD)&M[(F0B?7LD;'TN(EQN(CM]?0H`
    `
    end


    ESC[78;89;13p ESC[110;121;13p
    [ Parent ]

    hardly. (5.00 / 1) (#227)
    by pb on Sat Aug 16, 2003 at 10:22:27 PM EST

    That isn't perl, hence, no one here can readily read it; they'd have to cut-and-paste it somewhere else and uudecode it first. How much of a hassle is that? Besides, now it isn't all on one line anymore!  ;)

    Nice job with the formatting, though. I generally just choose plain text instead of taking the time to make it all work (which would usually involve messing with sed, although perhaps it'd be easier to escape characters and use Auto Format now that we have it)
    ---
    "See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
    -- pwhysall
    [ Parent ]

    Thanks. /nt (5.00 / 1) (#228)
    by cbraga on Sat Aug 16, 2003 at 10:29:35 PM EST



    ESC[78;89;13p ESC[110;121;13p
    [ Parent ]

    You need to start a mentoring program. (5.00 / 1) (#232)
    by lukme on Mon Aug 18, 2003 at 12:32:18 AM EST

    It is the experienced person who should train/mentor the new guy.

    Any time someone new is brought onto a project, they should be sat down with an experienced programmer who knows the product, and should work closely with that programmer on at least one project.

    If a mentoring program is not part of your corporate culture, then perhaps you should start one - it would nip a number of your problems in the bud, before they become major problems.

    Furthermore, when a newer guy is working on your code, try to find out what it is that he does to it (this only works if you know him and know that he is working on your code). I have been able to do this several times, and in each case I was able to tell the newer guy how to do it better (i.e., with fewer lines and more clarity).

    It is the experienced people among us who have a responsibility to instill a sense of craftsmanship in the people who come after us. Let's face it, none of us will live forever, nor work at one job forever. If we don't pass on our hard-won experience, who will?




    -----------------------------------
    It's awfully hard to fly with eagles when you're a turkey.
    Certainly Not Logic | 244 comments (203 topical, 41 editorial, 0 hidden)