Kuro5hin.org: technology and culture, from the trenches

A Software Development Play in One Act

By Shimmer in Technology
Sat Feb 02, 2002 at 12:29:25 AM EST
Tags: Software (all tags)

Where I work, the debate over "stateless" vs. "stateful" application development is all the rage. Those of you who are familiar with these issues might find some humor in the following play.


[Late at night, outside a Gateway Country store. A group of middle-aged men are seen huddled around a monitor.]

A police car creeps up slowly with its lights off.

Officer 1: That look like Slashdot to you?

Officer 2: They sure ain't surfin' Disney, I'll tell you that.

Officer 1: I'll take care of this.

[The police car stops and Officer 1 jumps out.]

Officer 1: Hands off that keyboard, boys!

[The middle-aged men turn to run, but one trips and falls. Something drops out of his pocket onto the ground. Officer 1 approaches him.]

Officer 1: Is that an object, boy?

Man: No, sir, officer. That's just data.

Officer 1: Don't lie to me. That's an object. [He picks it up.] Look, it's got methods.

Man: They're static, I swear! No "this" pointer!

Officer 1: Hmmm... Show me your XML certification.

Man: [Stuttering] I... I ... I left it at home.

Officer 1: Tell me the truth -- you're not certified, are you boy?

Man: [Quietly] No, sir, truth is, I'm not.

Officer 1: [Sympathetically] How do you explain yourself?

Man: I was just executing some business rules, sir. Nothing too stateful.

Officer 1: But you know better, don't you?

Man: I suppose so. ... I guess I should've used a cookie.

Officer 1: That's right, you should've used a cookie. And for God's sake, son, put those rules in a stored procedure! Don't you have any pride in yourself?

Man: [Looking down] Okay. I will.

Officer 1: I'll let you off this time, but I'm going to have to confiscate these [he sneers] "operations". [He breaks the object in half and puts one piece in his pocket. He hands the other piece to the man.]

Man: [Sheepishly] But, this isn't my data, sir. This is XML.

Officer 1: You'll take it and you'll be happy about it. This is a service I'm providin'! Next time, think before you bring methods to a public place, you hear?

Man: Yes, sir. [He takes the XML.]

[The officer returns to his car and it drives away. The man puts the XML in his pocket and walks slowly offstage.]

Poll
Which camp are you in?
o Stateful 25%
o Stateless 5%
o Neither/Both 28%
o WTF? 40%

Votes: 71

A Software Development Play in One Act | 36 comments (31 topical, 5 editorial, 0 hidden)
Stateful! (3.83 / 6) (#3)
by TheophileEscargot on Fri Feb 01, 2002 at 02:09:56 PM EST

I've done some stateless development, which had some performance benefits, but IMHO it's a huge pain in the arse, especially when you get into passing about a million arguments into every damn function.

I guess I'm an object-oriented dinosaur. State and methods belong together, dammit!
----
Support the nascent Mad Open Science movement... when we talk about "hundreds of eyeballs," we really mean it. Lagged2Death

To a point... but only to a point... (3.75 / 4) (#6)
by aziegler on Fri Feb 01, 2002 at 02:52:19 PM EST

State (data) and methods belong together, but only to a point. This point is while a particular program is executing them. The reality is that data can -- and does -- exist without an execution context, and it's up to the programs that can operate on the data to provide a context. By keeping data and methods together all the time, you essentially are stating that the data can only be operated on by one program in one way. This is one reason that I think that object databases are worthless.

Passing around heavyweight objects (data + methods) just isn't really a good idea for distributed development. Lightweight objects (data + metadata), with defined ways of operating on those objects, are much more efficient and scalable.

Of course, I've also found that most people don't know diddly about entity relationship design, which is actually fundamental to OO but is discounted because it's also vital to relational databases. Data's utility is defined by its relationships with other data, not by the methods that operate on it. Good OO design cares only about the shape of the data while the program is operating on the data.



[ Parent ]
Maintainability vs. performance (3.50 / 2) (#7)
by TheophileEscargot on Fri Feb 01, 2002 at 03:49:26 PM EST

The thing is, when you've got a team of people working together, it's safer and easier if you keep the state and methods together. Whenever you separate them, you introduce more chance of a fuckup where developer X messes up the life of developer Y.

We software developers are always told that we think too short-term, and wrongly sacrifice maintainability for performance benefits that will be irrelevant in 5 years. Now we're being told the opposite.

I think the word scalability is turning into a euphemism for performance. We know we're not allowed to break encapsulation just for performance, but it's OK as long as it's for "scalability".
----
Support the nascent Mad Open Science movement... when we talk about "hundreds of eyeballs," we really mean it. Lagged2Death
[ Parent ]

Disagree... (5.00 / 1) (#19)
by aziegler on Fri Feb 01, 2002 at 10:40:55 PM EST

I disagree. Keeping data (state) and methods together does not necessarily decrease the chance of developer X fucking up the life of developer Y because of changes made. My specialty for the last few years has been riding herd on developers as a DB designer and making sure that database changes are made logically and properly. I did what I could to teach others proper ER concepts. We had more problems from people changing the class signatures or method signatures than from the database changes.

I think you're also misreading my objection. I am all for OO development practices (even in non-OO languages). I am against passing around heavyweight objects in distributed development or serializing heavyweight objects. (The last place that I worked before I decided to strike out on my own had some CORBA work -- and they had to switch to lightweight objects because the performance difference was at least two orders of magnitude; they still couldn't match the performance of the non-CORBA distributed implementation that was already in place.) Heavyweight objects essentially tie you to a single language or runtime (e.g., the JVM or the CLR, or something else like that) -- otherwise, how are you going to operate on the object? I don't know much about SOAP, but I suspect that it depends more on lightweight objects (but I suspect it has other flaws).

My argument isn't for breaking encapsulation; it's for not enforcing encapsulation when it isn't appropriate. (Plus, the performance aspect of sending a heavyweight object across the wire is, at least in my experience, nasty. Manage your transmission transactions properly, though, and your programs won't even really "know" that the objects weren't heavyweight to begin with.)



[ Parent ]
Why was that? (3.50 / 2) (#22)
by greenrd on Fri Feb 01, 2002 at 11:51:31 PM EST

they had to switch to lightweight objects because the performance difference was at least two orders of magnitude;

Between lightweight and heavyweight, or between non-CORBA implementation and heavyweight?

If the former, can someone more knowledgeable about CORBA than me hazard a guess as to why that should be?


"Capitalism is the absurd belief that the worst of men, for the worst of reasons, will somehow work for the benefit of us all." -- John Maynard Keynes
[ Parent ]

The former... (4.00 / 1) (#26)
by aziegler on Sat Feb 02, 2002 at 03:44:46 AM EST

The two orders of magnitude was between lightweight and heavyweight. Basically, there are three ways to implement CORBA. The first is to pass a reference and call a function -- over the wire -- on your distributed structure. This is the worst of the three possibilities for objects that you're going to have to be dealing with a lot, mostly because you're going to be sending data in little packets and making a lot of round-trips to the server.

The second is to pass around a heavyweight object. This, of course, only works when you have the same language on both ends (which sort of defeats the purpose of CORBA in the first place); if you don't, you're back to treating the heavyweight object as a remote reference with lots of round-trips to the server. Even if you do have the object, you may find that you actually have to get another object so that your object is complete and usable. Finally, you may get the object and find out that it doesn't actually do anything you need it to do (there's no method available) and you're having to do the same amount of work that you'd have to do if all you had was a lightweight object.

The lightweight approach basically sends a structure across the wire. There may be references to some methods available, but you're sent as much information as possible across the wire at once so that the roundtrip costs are less than they might otherwise be with either of the above two approaches. It's not perfect, but it's certainly more scalable than the other two.

The non-CORBA implementation was faster yet than the lightweight objects, but it was an untyped system (in fact, for a good portion of its life, it was fixed-width fields instead of delimited fields). Implementation and management of the non-CORBA system was much harder (you had to manage two different sets of code for encoding and decoding -- and both sides had to be able to do both, because one was in C and the other was in C++). That's not to say that a non-CORBA system requires that. I can think of a few easy ways to simplify the encoding and decoding problems that were faced -- and make it so that there was only one codebase to be maintained instead of two or more.



[ Parent ]
OO databases don't have to be brittle (3.00 / 2) (#9)
by greenrd on Fri Feb 01, 2002 at 05:45:55 PM EST

By keeping data and methods together all the time, you essentially are stating that the data can only be operated on by one program in one way. This is one reason that I think that object databases are worthless.

If you couldn't easily get at private data that you need to meet new requirements that would be a strong objection. But (a) are properties which the client can only write to but not read from, really that common in OO databases in practice? and (b) in order to change the behaviour of existing objects, we need good schema evolution support in OO databases (my research topic, incidentally). With sufficiently flexible schema evolution support, data can be migrated on demand or en masse to a new schema providing the required functionality; hence your objection is, shall we say, "worthless". ;)


"Capitalism is the absurd belief that the worst of men, for the worst of reasons, will somehow work for the benefit of us all." -- John Maynard Keynes
[ Parent ]

OO databases are by definition brittle... (4.00 / 2) (#18)
by aziegler on Fri Feb 01, 2002 at 10:30:07 PM EST

When you tightly tie your schema and your operations together, you're going to find it difficult to support new operations -- because you have to change your schema. This is the reality of entity relationship design -- a specialty of mine. In my prior response to you, I pointed you to a response of mine to Carnage4Life's OODB article last May, where I pointed out that OODBs are practically worthless at supporting multipath access relationships because OO can't model such without using pointers (bad) or breaking such relationships into separate objects (which means that it's no better than a relational implementation with worse performance).



[ Parent ]
Data+methods not heavier than data+metadata (3.00 / 2) (#10)
by greenrd on Fri Feb 01, 2002 at 05:55:29 PM EST

Passing around heavyweight objects (data + methods) just isn't really a good idea for distributed development. Lightweight objects (data + metadata), with defined ways of operating on those objects, are much more efficient and scalable.

Nonsense! The methods only need to be transferred once for each class, each time the class changes. That should mean a negligible overhead. What makes you think passing a few classes around will slow it down too much, if done efficiently?


"Capitalism is the absurd belief that the worst of men, for the worst of reasons, will somehow work for the benefit of us all." -- John Maynard Keynes
[ Parent ]

Lock-in (4.00 / 1) (#17)
by aziegler on Fri Feb 01, 2002 at 10:25:18 PM EST

This status still locks you into a single architecture. Not all languages are actually appropriate to all problems (I'm sure you're aware of this, but it's worth restating). When you provide a description of the data, and then the data itself (e.g., a DTD and data, or a relational database), then you're able to manipulate and query the data in ways that weren't envisioned by the original developers. Indeed, such techniques are the strength of lightweight objects as opposed to heavyweight objects.

I might choose to do a simple transformation on the data -- have a Perl front-end which takes the data and does a quick validation before sending it to a more robust C++ back-end for processing. I can't do this if I'm using heavyweight objects. I can if I'm using lightweight objects.

The real problem I see with the standard OO advocacy is that it says that data+methods = object, but forgets that there are many ways of viewing the data. (See this comment I made last year.)



[ Parent ]
I don't understand (3.50 / 2) (#21)
by greenrd on Fri Feb 01, 2002 at 11:40:16 PM EST

When you provide a description of the data, and then the data itself (e.g., a DTD and data, or a relational database), then you're able to manipulate and query the data in ways that weren't envisioned by the original developers. Indeed, such techniques are the strength of lightweight objects as opposed to heavyweight objects.

That sounds similar to the argument for weak typing.

Anyway, there are two possible scenarios here, which mark opposite points on a spectrum. Information hiding can be very frustrating if you have to interface with closed-source code which hides too much - or code that's hard to understand, or expensive to change. However, in an ideal system where all the source code of interest is available and clearly-written, and where rapid refactoring is not a problem (this is where schema evolution comes in if you're using an OODB, although it's not really up to scratch in existing OODBMSs yet), you can use information hiding to separate concerns to make the system easier to understand and review, and refactor as needed to deal with changing requirements. Information hiding is not meant to make people's lives harder pointlessly, it is meant to cleanly separate concerns and thus both simplify coding and reduce errors. If you have the source available (and you have either schema evolution support or no persistent instances of this class) you are not forever locked out from the private members of a class - you can modify the class.

I appreciate that many real-world systems do not approach this ideal system. Of course, mixing multiple languages, or freezing interfaces too early, act as impedances to refactoring.

The real problem I see with the standard OO advocacy is that it says that data+methods = object, but forgets that there are many ways of viewing the data.

If you need more views, use wrappers or modify the original class.

See this comment I made last year.

I don't understand that comment. What do you mean by "child"?


"Capitalism is the absurd belief that the worst of men, for the worst of reasons, will somehow work for the benefit of us all." -- John Maynard Keynes
[ Parent ]

It's in the design approach... (4.00 / 1) (#27)
by aziegler on Sat Feb 02, 2002 at 04:17:59 AM EST

When you provide a description of the data, and then the data itself (e.g., a DTD and data, or a relational database), then you're able to manipulate and query the data in ways that weren't envisioned by the original developers. Indeed, such techniques are the strength of lightweight objects as opposed to heavyweight objects.

That sounds similar to the argument for weak typing.

Not at all. I'm currently developing in Pascal (it's not Object Pascal, but the targeted platform doesn't have that yet), and have spent a good portion of the last few years with an Ada derivative (PL/SQL) as well as C++, Java, and C. Strongly typed languages are much more pleasant to use than weakly typed languages for most projects, IME, though weakly typed languages have their place. Rather, what I'm trying to get at is a different point of view: data has value in and of itself, independent of the algorithms that might be applied to the data. Many OO advocates, designers, and developers forget this, and run into all kinds of trouble when they find out that they need to access the data -- still strongly typed for the fields -- in different ways than was programmed. (Ideally, this could be fixed by an MVC pattern, but even that has its limits, because it still causes you to think in terms of operations on the data; operations on the data are independent of the type of the data, really -- and I know that's heresy to OO.)

Anyway, there are two possible scenarios here, which mark opposite points on a spectrum. Information hiding can be very frustrating if you have to interface with closed-source code which hides too much - or code that's hard to understand, or expensive to change.

Really, it doesn't have so much to do with the availability of source code as it does how one approaches design. When I do database design, I design in terms of the entity relationships -- how the data are related to each other. When I do OO program design, I design in terms of the operational relationships -- how the data are operated upon. Both are useful, but the operational relationships are limited to a particular problem domain which may not cover the entirety of the utility of the data. It's far more productive, in my experience, to consider the entity relationships when you're talking data store (and this can include transmission across a network). When you're working on a particular problem, then you can consider the operational relationships. This doesn't mean that you can't have encapsulation and reuse through OO, but it means that your data isn't bound to the methods -- the methods are bound to the data as needed.

However, in an ideal system where all the source code of interest is available and clearly-written, and where rapid refactoring is not a problem (this is where schema evolution comes in if you're using an OODB, although it's not really up to scratch in existing OODBMSs yet), you can use information hiding to separate concerns to make the system easier to understand and review, and refactor as needed to deal with changing requirements. Information hiding is not meant to make people's lives harder pointlessly, it is meant to cleanly separate concerns and thus both simplify coding and reduce errors. If you have the source available (and you have either schema evolution support or no persistent instances of this class) you are not forever locked out from the private members of a class - you can modify the class.

Schema evolution isn't the problem. It's the binding that's the problem. The example I gave in last year's comment was the relationship between an Account and its Packages. In a relational database, there's a many-to-many relationship modeled with a third table, an AccountPackage table. If you're developing such a thing for a program, you're probably going to consider that the Account owns a set of Packages (a containment relationship). You go through your program development, and then a new requirement comes along: you now have to be able to determine which accounts have a particular package.

Because you've bound your account/package relationship to the account, you now have to go through all accounts to determine which accounts have which package. You could fix this by adding a set of Accounts to Packages, but now you're storing the same data in two places. The other solution, of course, is to promote the AccountPackage relationship to a first class object -- which means that you've just replicated the very same thing that a relational database does. Because you (hypothetical you, of course) considered the relationship in terms of operation (an account owns a package and operates on its set of packages) instead of relationship, you've had to go through a significant amount of work to be able to do what should have been a simple task.
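A minimal sketch of the two designs, assuming string identifiers for accounts and packages. The names owned, accounts_with_scan, and AccountPackage are illustrative only and are not taken from the system being described.

```python
# Containment design: each account owns its set of packages.
owned = {
    "acct-1": {"basic", "sports"},
    "acct-2": {"basic"},
}


def accounts_with_scan(owned_packages, package):
    """The new requirement forces a walk over every account."""
    return {acct for acct, pkgs in owned_packages.items() if package in pkgs}


class AccountPackage:
    """The relationship promoted to a first-class object.

    It holds (account, package) pairs, much like a relational link table,
    so neither side owns the relationship and the data is stored once.
    """

    def __init__(self):
        self._rows = set()

    def link(self, account, package):
        self._rows.add((account, package))

    def packages_of(self, account):
        return {p for a, p in self._rows if a == account}

    def accounts_with(self, package):
        return {a for a, p in self._rows if p == package}


rel = AccountPackage()
for acct, pkgs in owned.items():
    for pkg in pkgs:
        rel.link(acct, pkg)

print(sorted(rel.accounts_with("basic")), sorted(rel.packages_of("acct-1")))
```

Both lookups here are plain scans over the pairs; a database would put an index on each column. The point of the sketch is only that the question can be asked from either side without duplicating the data.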

I appreciate that many real-world systems do not approach this ideal system. Of course, mixing multiple languages, or freezing interfaces too early, act as impedances to refactoring.

Really, there is a way to make this better. It involves the data + metadata that I mentioned. It should be possible to create an object based purely upon reflection (or properties) as defined by the metadata and operated on by the data. Metadata doesn't have to just include the data types, it can include simple instructions limiting what sort of operations can be performed on the provided data.
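One way to read "an object based purely upon metadata" is sketched below. The metadata format (a field name mapped to a type and a writable flag) and the function make_record_type are assumptions made up for this illustration; the comment does not specify either.

```python
def make_record_type(type_name, metadata):
    """Build a class whose properties are defined purely by metadata.

    metadata maps field name -> {"type": <python type>, "writable": bool}.
    """

    def make_property(field, spec):
        slot = "_" + field

        def getter(self):
            return getattr(self, slot)

        def setter(self, value):
            # The metadata also limits which operations are allowed.
            if not spec["writable"]:
                raise AttributeError(field + " is read-only")
            if not isinstance(value, spec["type"]):
                raise TypeError(field + " must be " + spec["type"].__name__)
            setattr(self, slot, value)

        return property(getter, setter)

    def init(self, **values):
        for field, spec in metadata.items():
            value = values[field]
            if not isinstance(value, spec["type"]):
                raise TypeError(field + " must be " + spec["type"].__name__)
            setattr(self, "_" + field, value)

    namespace = {"__init__": init}
    for field, spec in metadata.items():
        namespace[field] = make_property(field, spec)
    return type(type_name, (object,), namespace)


ACCOUNT_METADATA = {
    "id": {"type": int, "writable": False},
    "owner": {"type": str, "writable": True},
}

Account = make_record_type("Account", ACCOUNT_METADATA)
acct = Account(id=1, owner="Ada")
acct.owner = "Grace"
print(acct.id, acct.owner)
```

The receiving program needs only the data and its description to get a typed object; which operations to build on top of it is left to that program.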

The real problem I see with the standard OO advocacy is that it says that data+methods = object, but forgets that there are many ways of viewing the data.

If you need more views, use wrappers or modify the original class.

Neither of which is necessarily possible nor desirable. Indeed, in the AccountPackage situation, a wrapper class would do nothing to fix the situation -- and modifying the original class becomes a significant matter because interfaces end up changing (at least to some degree). If your basic object is simply an object with properties (analogous to C# or Delphi properties), then you're essentially writing an operational wrapper in every case. If it isn't that simple, then your object is going to grow very quickly into something horrendously large as more operations on the data become required.

See this comment I made last year.

I don't understand that comment. What do you mean by "child"?

I'm not quite sure, to be honest, since I used it in two different ways. AccountPackage would simply be a relationship table between Account and Package (I know the right term, but it's late); the common way of choosing to store the information would be to use a containment relationship for Package to relate it to Accounts within OO. Hopefully, I've made myself clearer in this comment.



[ Parent ]
Problem Domain vs Solution Domain (none / 0) (#34)
by codemonkey_uk on Mon Feb 04, 2002 at 05:18:31 AM EST

You seem to be muddling solution domain and problem domain. Quite often, data exists in the problem domain; that is, for example, a customer account object is directly mapped to the real-world business customer account.

This customer account exists as a problem domain object. Now, when creating an application which performs operations on this problem domain object, it is quite normal to consider it from the point of view of the solution domain for the application, be it C++, Java, or another object-oriented language.

Now, the key is to not forget that it exists in both locations. In the problem domain we have the customer account. In the solution domain that might be class CustomerAccount: an object that supports serialisation, exporting to, say, XML.

The key point to remember here is that a single problem domain can map to multiple solution domains. If you forget that and tie your thinking about the problem to a single solution, you might be cutting off your nose to spite your face.

To summarise: Implementation (Solution Domain) Objects are not weakened by Problem Domain Data. In fact, by continuing to consider Problem Domain Data you open up the possibility of multiple solution domains.

Further Reading: Multi-Paradigm Design for C++
---
Thad
"The most savage controversies are those about matters as to which there is no good evidence either way." - Bertrand Russell
[ Parent ]

Second response... (3.00 / 1) (#20)
by aziegler on Fri Feb 01, 2002 at 10:48:01 PM EST

I think I see where we're having different discussions. The way that I design my work, I don't rely on globals (I have a bit of refactoring to do on my current project, as the IDE loves globals, and they're potentially bad news because I may not have access to them in certain execution states), but I also don't pass around "a million arguments." Rather, I'll use a singleton object when I absolutely have to have something that is global and painfully common, or I'll use some sort of object or structure to pass around the most common information necessary to keep state.

As alader says in comment 16, it's not that I don't keep state -- it's that I don't force state outside of the object that I'm working in, in as much as I can help it. I'm trying, in fact, to avoid side-effects by keeping state where it belongs.
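A small sketch of the "structure that carries the common state" idea, under the assumption that the shared information is a user id, a locale, and a currency. RequestContext and both functions are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    """Common state bundled once instead of passed as separate arguments."""

    user_id: int
    locale: str
    currency: str


def format_price(ctx, amount):
    # Reads from the context; being frozen, the context cannot be
    # modified behind the caller's back.
    return f"{amount:.2f} {ctx.currency}"


def greet(ctx, name):
    greetings = {"en": "Hello", "fr": "Bonjour"}
    return f"{greetings.get(ctx.locale, 'Hello')}, {name}"


ctx = RequestContext(user_id=42, locale="fr", currency="CAD")
print(greet(ctx, "Ada"), format_price(ctx, 9.5))
```

Each function takes one extra parameter rather than three, and because the dataclass is frozen no callee can change the state as a side effect.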



[ Parent ]
singletons - the acceptable global of the late 90s (none / 0) (#33)
by codemonkey_uk on Mon Feb 04, 2002 at 04:57:32 AM EST

Singletons are the most overused design pattern ever.

A singleton is just a global object with more tightly controlled creation/destruction semantics, and that's all there is to it.

Do you really have that complex a start up / shut down procedure that you need singletons? Or are you just using them as an "acceptable global"?
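To make the comparison concrete, here is a minimal singleton sketch. Config, set_mode, and read_mode are made-up names for illustration only.

```python
class Config:
    """Classic singleton: one shared instance, created on first use."""

    _instance = None

    def __init__(self):
        self.settings = {}

    @classmethod
    def instance(cls):
        # Creation is controlled here; after that it is shared state.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


def set_mode(mode):
    # Nothing in the signature reveals that shared state is written.
    Config.instance().settings["mode"] = mode


def read_mode():
    # Nothing in the signature reveals that shared state is read.
    return Config.instance().settings.get("mode")


set_mode("debug")
print(read_mode(), Config.instance() is Config.instance())
```

The two functions are coupled through Config exactly as they would be through a module-level variable; the only thing the pattern adds is control over when the instance is created.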
---
Thad
"The most savage controversies are those about matters as to which there is no good evidence either way." - Bertrand Russell
[ Parent ]

I don't get it (4.00 / 5) (#4)
by core10k on Fri Feb 01, 2002 at 02:11:36 PM EST

How does this refer to stateful vs stateless software development? As far as I knew, stateful meant 'real software that doesn't take an eternity to write' and stateless meant 'crufty low-functionality web development.'

I think I'm either A) missing the joke or B) working from the wrong definitions.



You're on the right track (4.66 / 6) (#5)
by Shimmer on Fri Feb 01, 2002 at 02:21:11 PM EST

In a nutshell:

The stateful camp advocates "old-fashioned" object-oriented programming. Basically this means that objects need to live somewhere (either on the server or on the client) for some period of time -- perhaps longer than a single transaction.

The stateless camp believes that stateful applications do not scale well in the web-ified, distributed world. Essentially, each object should be broken into two pieces: data (represented via XML), and functions that operate on that data.
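As a rough illustration of that split, the sketch below keeps a shopping cart as an XML string and operates on it with free functions. The element and attribute names (cart, item, sku, qty, price) are invented for this example.

```python
import xml.etree.ElementTree as ET

CART_XML = (
    "<cart>"
    "<item sku='A1' qty='2' price='3.50'/>"
    "<item sku='B2' qty='1' price='10.00'/>"
    "</cart>"
)


def cart_total(cart_xml):
    """Pure function: parses the data, computes, keeps nothing around."""
    root = ET.fromstring(cart_xml)
    return sum(
        int(item.get("qty")) * float(item.get("price"))
        for item in root.findall("item")
    )


def add_item(cart_xml, sku, qty, price):
    """Returns a new XML string; the caller holds the state between calls."""
    root = ET.fromstring(cart_xml)
    ET.SubElement(root, "item", sku=sku, qty=str(qty), price=f"{price:.2f}")
    return ET.tostring(root, encoding="unicode")


updated = add_item(CART_XML, "C3", 1, 2.25)
print(cart_total(CART_XML), cart_total(updated))
```

No object outlives a call: the data travels as XML and each function rebuilds whatever it needs from it.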

The stateful people (I'm one) feel that this is actually a step backwards, not a step forwards.

Hope this helps.

-- Brian

Wizard needs food badly.
[ Parent ]
woefully misguided (3.80 / 5) (#16)
by alader on Fri Feb 01, 2002 at 10:02:34 PM EST

I'm groaning because this is so painful. I'm one of Shimmer's favorite debaters; we are colleagues. And we like to argue this issue to death. For some reason, there seems to be a belief in the minds of those in the "stateful" camp that those of us in the "stateless" camp are against "old-style" programming, or against OO, or just have it out for objects in general. Puhhleeeaaase!! Nothing could be further from the truth.

First off, I hate, let me repeat, hate, the name of this argument. We are not debating state. State is required always in some form or another while a user is engaged in your application. Period. No exceptions. How you store state and where are the real issues. The "stateful vs. stateless" title for this discussion has the unfortunate side-effect of inviting discussions that miss the target entirely.

Second, stateful architectures can scale in the web world, but stateless architectures tend, let me repeat, tend to scale better. There are instances where the reverse is true, of course. But in most cases, stateless scales better. Sorry if this ruffles feathers. This is just a tendency, so don't get over-excited.

Third, objects are used within web services, or other stateless architectures. Just because we choose to model the real world versus the ideal world (now I'm pushing Brian's buttons here), does not mean that those of us in this camp don't like objects. We just don't feel the need to model every frellin' object in the world when we design our smaller-problem-domain web services. Behind the castle walls that protect the inner workings of web services, there may be lots of objects -- only we design just the classes we need, without anticipating every possibility of their reuse. But it is the gatekeeper, the protecting wall, which is exposed as the web service that often seems to mislead those in Brian's camp.

For example, recently in our company, we debated the concept of two classes: an employee class and a certification class. We were arguing about where to place a method that returns all of the certifications for a particular employee, and where to place a method that returns all the employees that have passed a particular certification. The debate raged on when I voiced my opinion. I suggested that we create one class, EmployeeCertification, that had both methods. I came to this conclusion using the same thinking I use when I design web services. I boil the functionality I am delivering down until I have just the basic, raw purposes for building the application in the first place. Voila! This application needed to deliver two types of information. Why have two classes? So we can store their name and address in one, and the name and certification number in the other??? Let it go! Just let it go. Embrace the new millennium.

Those in the opposing camp believe that because web services often pass data around using XML, all we care about is XML. Then they draw the conclusion that our objects are not really objects any more because they don't have data, just behavior. Blah blah blah, yawwwn. Not so at all. Objects may get their data from an XML string, and they may produce a new modified XML string as output, but what goes on inside is stateful, OO 1980's, 1990's style, the same boring stuff that those "stateful" objects do. The difference is the lifetime.

So there's the next argument. Stateless architectures perform slower because they store their state elsewhere, like in a database or an XML file. Uhhhhh, what do you think those stateful objects do??? How do they persist their state when the machine falls over and is rebooted? They do the same frellin' thing, people. It's just about lifetime. Those who are more rigid, religious, or just plain uncomfortable with change, get all itchy and scratchy when you talk about objects whose state may live longer than a particular instance of the class. What's the big deal? Honestly? Clue me in, please.

Okay, that's it for tonight. I think I gave you all enough to chew on and spit back. I look forward to the many rebuttals!!

--Andrew

[ Parent ]
Stateful vs Stateless (3.50 / 2) (#8)
by MSBob on Fri Feb 01, 2002 at 05:39:35 PM EST

Well, I don't know your specific application but the stateful/stateless discussion only makes sense in certain types of software. When I worked on a desktop application it wasn't of any relevance or concern.

These days I do enterprise development and this issue has come up multiple times. Our application has state distributed across our components. We don't use EJBs, but using the EJB terminology I'd say that 70% of our business objects are like stateful session beans and only 30% are more like stateless beans.

I have a feeling that coding with the stateless approach offers some great benefits like easy to implement session failover (if you persist the state object every time you make a change to it) and optimizing database interactions should be easier because you have the single object in the session where the state is stored and can be traced and analyzed.
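The failover benefit can be sketched as follows, assuming the session state is small enough to serialise on every change. SessionStore and handle_request are hypothetical, and a dict of JSON strings stands in for a real database.

```python
import json


class SessionStore:
    """Hypothetical durable store shared by all servers."""

    def __init__(self):
        self._rows = {}

    def save(self, session_id, state):
        self._rows[session_id] = json.dumps(state)

    def load(self, session_id):
        return json.loads(self._rows.get(session_id, "{}"))


def handle_request(store, session_id, key, value):
    """Any server can run this: load the state, change it, persist it."""
    state = store.load(session_id)
    state[key] = value
    store.save(session_id, state)
    return state


store = SessionStore()

# First request is served by "server A".
handle_request(store, "s1", "step", 1)

# "Server A" goes away; "server B" picks up the same session from the store.
final_state = handle_request(store, "s1", "plan", "gold")
print(final_state)
```

Because nothing about the session lives in a server's memory between requests, losing a server loses no state. The price is a load and a save on every request.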

The drawback of the stateless approach is that the state has to be passed around and often includes data that is meaningless to everything but the specific objects that happen to consume it. In other words, encapsulation can be violated, although you can remedy that with patterns like the Memento from the Gang of Four. The other problem may be that persisting the state object may become expensive and/or cumbersome (think BLOBs). I can't really comment on other problems with the stateless model because our enterprise is stateful and I have never been exposed to a truly stateless app. Maybe someone with more enterprise development experience can give us a summary of the cost/benefit ratio of a truly stateless model.
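
To illustrate the Memento remedy, here is a minimal Python sketch. OrderDraft and its methods are hypothetical names, and JSON is just one assumed way to produce the blob. The business object hands out its state only as an opaque token, so whatever carries the state around never needs to understand it.

```python
import json


class OrderDraft:
    """Hypothetical business object; its fields stay private to the class."""

    def __init__(self):
        self._lines = {}  # sku -> quantity

    def add_line(self, sku, qty=1):
        self._lines[sku] = self._lines.get(sku, 0) + qty

    def total_quantity(self):
        return sum(self._lines.values())

    def create_memento(self):
        # Opaque blob: the session layer stores it but never looks inside.
        return json.dumps({"lines": self._lines}).encode("utf-8")

    @classmethod
    def from_memento(cls, memento):
        # Only this class knows how to interpret the blob.
        draft = cls()
        draft._lines = json.loads(memento.decode("utf-8"))["lines"]
        return draft
```

The session (or database row, or cookie) holds only the bytes returned by create_memento(). Persisting that blob after every change is also what would make failover straightforward, at the cost mentioned above when the blob gets large.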

I don't mind paying taxes, they buy me civilization.

Any links... (3.00 / 1) (#11)
by greenrd on Fri Feb 01, 2002 at 05:57:57 PM EST

...advocating stateless development, anyone? I'd like to hear the other side of the story - something more than a soundbite.

This might help (3.00 / 1) (#14)
by Shimmer on Fri Feb 01, 2002 at 07:42:04 PM EST

Unfortunately, there don't seem to be a lot of clear-cut position papers on the topic. Most folks simply assume one position or the other.

Anyway, here's a typical article that takes the stateless side. Many fans of XML take this viewpoint for some reason.

-- Brian



Wizard needs food badly.
[ Parent ]
Here's the traditional example... (5.00 / 4) (#24)
by mech9t8 on Sat Feb 02, 2002 at 02:14:23 AM EST

...of why stateless is good.

Consider a shopping cart application.

Consider thousands of users using the shopping cart application.

In a stateful way of doing things, when a visitor puts something into his shopping cart, a shopping cart object is created on the server. Subsequent additions, manipulations, etc etc, are implemented by manipulating that shopping cart object.

So, for every visitor to the site, a shopping cart object is created. If the customer completes a purchase, the shopping cart is destroyed, and the memory freed. Otherwise, it stays there forever. So an arbitrary expiration period must be added. Still, the number of visitors is limited by the memory of the machine - and the memory of the machine limits how long these shopping cart objects will last. Customers want to do something else for 10 minutes and then continue shopping? Tough, if we let the sessions last that long the server will be overwhelmed...

For a stateless implementation, those limitations are eliminated. The information for that visitor can be stored on the client (in which case you can have an infinite number of visitors) or in a database (in which case you're limited by the database). You only need enough objects on the server to handle current requests, not one for every visitor in the last ten minutes. The trade-off is, of course, that every time a page is visited, more data has to be transferred and a bit more processing has to be done, so fewer transactions can be handled per second...
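
A minimal Python sketch of the two approaches just described. The class and function names are made up for illustration, and the client-side storage (a cookie, say) is represented by nothing more than a list passed in and out.

```python
class StatefulCartServer:
    """Stateful: one live cart per visitor, held in server memory."""

    def __init__(self):
        self.carts = {}  # visitor_id -> list of items; grows with every visitor

    def add_item(self, visitor_id, item):
        self.carts.setdefault(visitor_id, []).append(item)
        return list(self.carts[visitor_id])


def add_item_stateless(cart_from_client, item):
    """Stateless: the cart arrives with the request and leaves with the response."""
    # Nothing is kept on the server once this call returns.
    return cart_from_client + [item]
```

In the stateful version, memory use follows the number of visitors who have not yet expired. In the stateless version it follows the number of requests in flight, and the price is that the whole cart travels on every request.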

which leads to when it becomes really important: when scalability enters into it. If you've got a server farm, and you're using stateFUL objects, every request from a particular user has to go back to the particular server where the object was created - if it's busy, too bad, there's no way to balance them. With stateLESS objects, you can just keep adding servers - the database becomes the limiting factor, and databases are much more scalable. Which makes managing the server farm a lot easier.
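
A minimal sketch of the server-farm point, again in Python. SharedStore stands in for the database, and WebServer and round_robin are hypothetical names. Because no server keeps the cart between requests, a balancer can hand each request to whichever server is next.

```python
import itertools


class SharedStore:
    """Stands in for the database shared by the whole farm."""

    def __init__(self):
        self._rows = {}

    def load(self, visitor_id):
        return list(self._rows.get(visitor_id, []))

    def save(self, visitor_id, cart):
        self._rows[visitor_id] = list(cart)


class WebServer:
    """Keeps no per-visitor state between requests."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_item(self, visitor_id, item):
        cart = self.store.load(visitor_id)
        cart.append(item)
        self.store.save(visitor_id, cart)
        return cart


def round_robin(servers):
    # Simplest possible balancer: no session affinity needed.
    return itertools.cycle(servers)
```

Two consecutive requests from the same visitor can land on different servers and still see the same cart. With stateful objects the balancer would instead have to remember which server owns each visitor.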

In many ways, stateless is a throwback to procedure-oriented programming. But it's simply needed to handle the demands of high-traffic, indefinite-lifespan applications like web design.

There's also another, more philosophical, argument. That is: traditionally objects are things which manipulate their own data. You say, "Hey shopping cart, add this item and then process the order." Another way of thinking about objects, which is more towards the stateless philosophy, is that objects are things which manipulate data, instead of containing the data themselves. The data is a thing on its own, which can be manipulated by a bunch of objects. So the shopping cart is the bit of data, and you give it to the Adder object to add an item, the Cashier object to process the order, etc. That way the different elements could be in totally different servers: the Adder object could be on the web server, for example, whereas the Cashier object could be on, say, PayPal's server. Or something like that - these examples are a bit simplistic, but hopefully I've made the point clear...

Of course, that's not necessarily an argument for Stateless vs. Stateful, as the Adder object could hold on to the shopping cart for a while until it's done adding stuff, or the Cashier object could hold onto the data for a while until the customer is done entering his information. But it lends itself to a stateless implementation. Get the shopping cart, have the customer enter the data, then feed them into one method on the Cashier object to process the order.
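
A minimal Python sketch of the Adder/Cashier idea. Adder and Cashier are the names used above; CartData, the method signatures, and prices in whole cents are assumptions for illustration. The cart is plain data, and each object performs one operation on it and hands it back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CartData:
    """The cart is just data: a tuple of (sku, price_in_cents) pairs."""
    items: tuple = ()


class Adder:
    def add(self, cart, sku, price_cents):
        # Returns a new cart instead of holding on to the old one.
        return CartData(items=cart.items + ((sku, price_cents),))


class Cashier:
    def process(self, cart, customer):
        # One call takes everything it needs: the cart and the customer's details.
        total = sum(price for _, price in cart.items)
        return {"customer": customer, "total": total, "lines": len(cart.items)}
```

Neither object keeps a reference to the cart after the call returns, so the Adder and the Cashier could just as well live on different machines, with the cart serialized between them.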

Anyway, hopefully those are the basic points for stateless development: the important one is scalability, not being limited by the number of objects one can create. The secondary one is the whole idea of having data as something which is manipulated by multiple objects.

--
IMHO
[ Parent ]
Virtual memory? (3.00 / 1) (#28)
by greenrd on Sat Feb 02, 2002 at 02:33:30 PM EST

Still, the number of visitors is limited by the memory of the machine - and the memory of the machine limits how long these shopping cart objects will last.

This might be being a bit naive - but couldn't you just allocate a sufficiently large amount of swap space? Yes, if you run out of real memory you will have to page data in and out to disk - but you would have to do that anyway with a database, or else send it to and from the client (maybe even slower).

I know existing OS-level virtual memory implementations are not perfect in terms of efficiency - it would be useful to have a facility to "hint" to the VM that a certain area of memory should be paged out first, because it is less critical.

every request from a particular user has to go back to the particular server where the object was created - if it's busy, too bad, there's no way to balance them.

If you balance incoming sessions equally, surely that means (common sense) that if a client with an existing session cannot connect to server A, it's likely (but not certain) that B, C and D will also be saturated. Yes it is slightly suboptimal, but if you get to that situation in the first place it surely indicates you are close to saturating your entire system, therefore you need either new hardware or you need to increase simultaneous connection limits.

Of course, that's not necessarily an argument for Stateless vs. Stateful,

It sounds like an argument for procedural programming! Your {Cart, Adder, Cashier} example doesn't contain any obvious uses of encapsulation, inheritance or polymorphism. There's nothing necessarily wrong with that for simple projects - but without using those features you're not really doing proper object-oriented programming at all. C structs / Pascal records are not objects.


"Capitalism is the absurd belief that the worst of men, for the worst of reasons, will somehow work for the benefit of us all." -- John Maynard Keynes
[ Parent ]

Well... (4.00 / 1) (#29)
by mech9t8 on Sat Feb 02, 2002 at 06:10:11 PM EST

1. Database servers are just *way* more efficient at the whole storing stuff on the hard drive idea than any current VM, with their optimizations, indexes, etc etc. They're just designed to do it. VMs are not. (And re-writing the kernel is just a bit out of the budget of most IT developers.<g>)
2. Perhaps... but it takes extra effort to make sure everyone goes back to the server they first used and adds another layer of complication and information that needs to be stored.
3. Contains no examples of such things, but doesn't necessarily exclude them, either. For instance, although the request to the PayPal server may contain only one function (as that is the most efficient way to transfer the information over the internet), there's no reason everything done at the PayPal end isn't fully object-oriented.

If you evaluate whether to use stateless or stateful programming based on whether they're "proper object-oriented programming", you're (a) going to choose stateful, and (b) using the wrong criteria. The choice between stateless and stateful (as with all programming choices) is based on what's best for the job - given the needs and available capabilities, what's the best way to handle this? Making a stateful application might make a beautiful programming architecture, but can completely neglect the physical and practical needs of the situation. The "properness" of your programming architecture is just one criterion of many when making programming decisions. The trick is finding the balance between maintainability, practicality, scalability, and pure performance.

--
IMHO
[ Parent ]
Why not a hybrid solution? (4.00 / 1) (#30)
by ksandstr on Sun Feb 03, 2002 at 11:00:52 AM EST

I may be replying to the wrong sub-thread. Sorry. This feels like a good place to do so, however.

It sounds to me like the "sides" in this "stateful/stateless" [un]holy war are both pushing their own approaches to the problem (storing bunches of more or less relational data, and performing associated operations on them) while forgetting about the good bits of both. I mean, CORBA (just to name an example; I'm sure other types of middleware have similar features) has this thing called a servant locator which lets object servers transparently deactivate, serialize and store objects when they aren't found important enough to keep immediately available. These objects then get transparently reactivated when a new operation comes down the pipe, where necessary.

To me, techniques like these sound like something that would be very close to capturing the best sides of both approaches. Unfortunately, this also requires the writing of de-/serialization routines, which though tedious should then perform much better than the page-oriented virtual memory subsystem present in most operating system environments.
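
To make the hybrid idea concrete, here is a minimal Python sketch. It is not the CORBA servant locator API, only an imitation of the behaviour described above, and PassivatingPool and its fields are hypothetical names. Carts stay live in memory up to a limit, the least recently used ones are serialized out to a backing store, and they are brought back when the next request for them arrives.

```python
import json
from collections import OrderedDict


class PassivatingPool:
    """Keeps at most `capacity` (assumed >= 1) carts live; the rest are serialized."""

    def __init__(self, capacity, backing=None):
        self.capacity = capacity
        self.live = OrderedDict()  # visitor_id -> list of items, least recently used first
        self.backing = backing if backing is not None else {}  # visitor_id -> JSON text

    def get(self, visitor_id):
        if visitor_id in self.live:
            self.live.move_to_end(visitor_id)  # mark as recently used
            return self.live[visitor_id]
        # Reactivate from storage, or start an empty cart for a new visitor.
        cart = json.loads(self.backing.pop(visitor_id, "[]"))
        self.live[visitor_id] = cart
        self._evict()
        return cart

    def _evict(self):
        # Passivate the least recently used carts until we are back under the limit.
        while len(self.live) > self.capacity:
            victim, cart = self.live.popitem(last=False)
            self.backing[victim] = json.dumps(cart)
```

The JSON calls here are the tedious de-/serialization routines mentioned above. A caveat of this sketch: a caller that keeps a reference to a cart after it has been passivated would be changing a stale copy, which real middleware has to guard against.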



Fin.
[ Parent ]
WTF? (none / 0) (#32)
by codemonkey_uk on Mon Feb 04, 2002 at 04:34:55 AM EST

Sounds like confused terminology to me. It's not stateless vs. stateful, but state-stored-on-server vs. state-stored-on-client.
---
Thad
"The most savage controversies are those about matters as to which there is no good evidence either way." - Bertrand Russell
[ Parent ]
agreed... (none / 0) (#35)
by kubalaa on Mon Feb 04, 2002 at 05:51:45 AM EST

And I'm mystified by the references to XML. "Stateless" vs "stateful" should probably be translated into "side-effectful" vs "functional".

[ Parent ]
Not really... (none / 0) (#36)
by mech9t8 on Tue Feb 05, 2002 at 08:24:25 AM EST

Considering stateless also includes storing state information in a server-side database or file. It's a matter of whether the objects themselves store state; from a user's point of view, a stateless application stores state, but from the object's point of view, every transaction is completely separate from the others. It just performs one operation on a set of data.

Therefore, it's a matter of, from a programming perspective, whether the objects themselves store state or not. Hence, stateless and stateful. (The same way object-oriented has nothing to do with whether the end users are actually manipulating objects or not.)
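
A minimal Python sketch of that distinction. CartService is a hypothetical name, and the dict passed in as db stands in for a server-side database or file. The object has no fields of its own, so each call is a separate transaction, yet the application as a whole still remembers the cart.

```python
class CartService:
    """Has no instance fields: every call is independent of the previous one."""

    def add_item(self, db, visitor_id, item):
        # All state is read from and written back to `db`, never kept on self.
        cart = db.get(visitor_id, []) + [item]
        db[visitor_id] = cart
        return cart
```

A fresh CartService() can be created for every request and thrown away afterwards. From the user's point of view the cart persists, while from the object's point of view nothing does.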

--
IMHO
[ Parent ]
Which is better - stateful code or stateless code? (4.33 / 3) (#23)
by ariux on Sat Feb 02, 2002 at 01:31:33 AM EST

For that matter, which is better - food or water?

I mean, really.

in the tradition of 'vax wars' and 'unix wars' (2.50 / 2) (#25)
by turmeric on Sat Feb 02, 2002 at 02:55:00 AM EST

excellent except that it needs to be a lot longer.

Debate is misnamed (5.00 / 3) (#31)
by Jackster on Sun Feb 03, 2002 at 03:59:14 PM EST

Code without state will always produce the same output for the same input. Nobody in the stateful-versus-stateless debate actually writes stateless programs. The issue they are debating is where to store the state.

"Stateful" programmers seem to view stateless ones as throwbacks to the procedural programming days. Being unacquainted with today's actual technology in stateless development, I'm not sure if it deprives programmers of the benefits of OO programming. But, theoretically, I don't think it would have to. Storing object data and methods in two places and recombining them on demand isn't incompatible with object orientation.


A Software Development Play in One Act | 36 comments (31 topical, 5 editorial, 0 hidden)