HistoryOfTransparentPersistence

Hi Randy

I have huge respect for Gemstone and the transparency they achieved. I
would love for JDO to be able to do the same. We actually had a separate
JSR for adding persistence hooks into the JVM itself, but that was not
adopted by the JCP.

For reference, the enhancement process currently employed by JDO serves that
purpose. Since the JVM cannot tell us when a field is referenced (so we can
lazily load it) or when it has been changed (so we can maintain our list of
dirty fields and dirty instances) we have to enhance the bytecode so that
the domain object will invoke its StateManager when necessary. If we had
appropriate persistence hooks, JDO would not need the PersistenceCapable
interface or the enhancement tool that implements it for already-compiled
classes.
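
To make the enhancement idea concrete, here is a minimal hand-written sketch of what an enhanced class might roughly look like. The names SketchStateManager, EnhancedCustomer, beforeRead and afterWrite are invented for illustration; they are not the real PersistenceCapable or StateManager contracts, and a real enhancer rewrites bytecode rather than source.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a JDO StateManager; not the real javax.jdo.spi API.
class SketchStateManager {
    private final Set<String> loadedFields = new LinkedHashSet<>();
    private final Set<String> dirtyFields = new LinkedHashSet<>();

    // Called before a field is read, so its value could be loaded lazily.
    void beforeRead(String field) {
        loadedFields.add(field); // a real implementation would fetch from the datastore here
    }

    // Called after a field is written, so the change can be flushed later.
    void afterWrite(String field) {
        loadedFields.add(field);
        dirtyFields.add(field);
    }

    boolean isLoaded(String field) { return loadedFields.contains(field); }

    Set<String> dirtyFields() { return dirtyFields; }
}

// The developer writes a plain class with a "name" field. After enhancement,
// every access to that field is routed through the state manager, roughly so:
class EnhancedCustomer {
    private String name;
    private final SketchStateManager stateManager;

    EnhancedCustomer(SketchStateManager stateManager) {
        this.stateManager = stateManager;
    }

    String getName() {
        stateManager.beforeRead("name");
        return name;
    }

    void setName(String name) {
        this.name = name;
        stateManager.afterWrite("name");
    }
}
```

With JVM-level hooks for field reads and writes, the two calls into the state manager would not need to be woven into the class at all.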

I do not claim we are the first to offer "transparent" persistence, or that
JDO is 100% transparent.

RobinRoos



In my experience, I don't /want/ transparent persistence. I /want/ to
say when "save" is called, because otherwise I am shackled by the
persistence layer's decision when to save my data. Maybe it saves too
often; maybe it saves too rarely. I think of this specifically in the
context of early entity beans/EJB.

Am I alone here? Is transparent persistence /really/ all it's cracked up
to be, or is it just another place where I have to rely on someone else
to "get it right"?

JBRainsberger


Hi J.B.

In JDO you say when to save, either by calling makePersistent(o) to persist
a new object that is not referenced by any other already-persistent object,
or by calling t.commit() after making an object dirty (by altering its
state, which includes changing the references it holds to other
persistence-capable instances). On commit, the dirty fields of dirty
instances are flushed to the database, and the reachability algorithm takes
care of any transient persistence-capable instances which are now reachable
from persistent ones, and makes them persistent. This functionality is
datastore-independent and guaranteed across all datastores. The JDO
implementation may flush to the datastore before commit time for various
reasons, but these are usually configurable.
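
The reachability step can be illustrated with a toy model. This is only a sketch under stated assumptions: Node and ReachabilitySketch are hypothetical classes, nothing is written to a real datastore, and the only thing modelled is which objects count as persistent after commit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical persistence-capable object: a name plus outgoing references.
class Node {
    final String name;
    final List<Node> references = new ArrayList<>();

    Node(String name) { this.name = name; }
}

// Toy unit of work; it only tracks which objects are persistent.
class ReachabilitySketch {
    // Identity-based set, since persistence is a property of the instance.
    private final Set<Node> persistent =
            Collections.newSetFromMap(new IdentityHashMap<>());

    // Explicitly persist an object that no persistent object refers to.
    void makePersistent(Node root) { persistent.add(root); }

    boolean isPersistent(Node n) { return persistent.contains(n); }

    // At commit, follow references from persistent objects and persist
    // every transient object that has become reachable from them.
    void commit() {
        Deque<Node> pending = new ArrayDeque<>(persistent);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            for (Node ref : current.references) {
                if (persistent.add(ref)) { // true only if newly persistent
                    pending.push(ref);
                }
            }
        }
    }
}
```

In this model, an order made persistent explicitly would carry a newly added order line with it at commit, while an object that nothing persistent refers to stays transient.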

There are two areas in which JDO could be more transparent. Firstly, it
would be great if the enhancement step was not needed. Without JVM hooks
that is difficult to achieve. Secondly, it would be great if we could
persist ANY class; at present we can only persist classes you mark up as
persistence-capable, and those System classes which your JDO vendor has
chosen to support. So if you suddenly acquire a new concrete collection
class, which your JDO vendor does not yet support, then you might have to
explicitly mark it up as persistence capable or create your own subclass
which is marked up in this way. In a Gemstone-like environment, both of
these would have been automatic.

The client still has complete control over which objects become persistent
or are deleted, and the transaction in which this takes place.

RobinRoos


I've been mired in WebSphere v5, then JUnit for the past eighteen
months, so I haven't been able to do any research in this direction, so
the information is useful. I admit that my reticence to embrace
transparent persistence is due to EJB/CMP, and perhaps that's judging a
population by its biggest idiot. :)

I am back to working on a real project and I am intentionally deferring
my persistence mechanism to spend more time on the domain model and also
to play around more with Fit/FitNesse.

I need an intern to go evaluate all this stuff for me. I wonder if any
university students are looking to volunteer. :)

JBRainsberger


I'm sure that the JFluid team would love to have your support for
their request to enhance the JVM to support such things:

http://developer.java.sun.com/developer/bugParade/bugs/4879835.html

SUN Bug ID #4879835: "Provide the dynamic bytecode instrumentation
capability, as found in JFluid"

JeffGrigg


Certainly I need control of the transaction boundaries (when to commit,
etc.) If the layer was smart enough, I'd give up control over "save". It
seems like a good framework could let me tweak that in the configuration of
the mapping layer, rather than surfacing it to the application. But if the
framework can't do it right every time, then it needs to give me at least
an option to override.

Even then, it is the application layer and not the domain objects that are
aware of this, right?

EricEvans


Hi Eric

Technically one could have a domain object demarcate a transaction in which
it then made itself dirty and then committed. I feel this would be very bad
form. A better design is the one you intimated, with domain objects doing
their domain logic unaware of transaction boundaries which are imposed by
application objects which are themselves non-persistent.

As far as when data hits the datastore is concerned, the application you
write should have little or no coding specific to this issue. You can
invoke a flush(), but you don't need to. The JDO implementation which
manages interactions with the database will flush automatically when it
deems it appropriate. One time this happens automatically is when you run a
query; if the appropriate configuration parameters were set, then the dirty
instances in your current transaction are flushed to the datastore prior to
query execution. This allows the datastore's query engine to correctly take
account of changes you are making. It preserves transaction isolation. But
it doesn't compromise transaction demarcation, as items flushed to the
datastore are not yet committed, and will be rolled back if your transaction
fails to commit when you tell it to or is rolled back deliberately.
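
The distinction between flushing and committing can be shown with a small in-memory model. It is a sketch only: FlushSketch and its methods are invented names, the "datastore" is a pair of maps, and flush-before-query is hard-wired here where a real implementation would make it configurable.

```java
import java.util.HashMap;
import java.util.Map;

// Toy session; the names are hypothetical and not part of the JDO API.
class FlushSketch {
    private final Map<String, Integer> committed = new HashMap<>(); // durable state
    private final Map<String, Integer> flushed = new HashMap<>();   // in the datastore, uncommitted
    private final Map<String, Integer> dirty = new HashMap<>();     // changed in memory only

    void change(String key, int value) { dirty.put(key, value); }

    // Push in-memory changes to the datastore without committing them.
    void flush() {
        flushed.putAll(dirty);
        dirty.clear();
    }

    // The "query engine" sees committed data plus this transaction's
    // flushed changes, so dirty instances are flushed first.
    int countGreaterThan(int threshold) {
        flush();
        Map<String, Integer> visible = new HashMap<>(committed);
        visible.putAll(flushed);
        int count = 0;
        for (int value : visible.values()) {
            if (value > threshold) {
                count++;
            }
        }
        return count;
    }

    void commit() {
        flush();
        committed.putAll(flushed);
        flushed.clear();
    }

    // Flushed but uncommitted changes disappear on rollback.
    void rollback() {
        dirty.clear();
        flushed.clear();
    }

    int committedSize() { return committed.size(); }
}
```

A query issued mid-transaction sees the pending change, yet nothing becomes durable until commit(), and rollback() discards the flushed change along with the dirty one.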

Even this level of flushing can be configured, but good application design
sees this as orthogonal to both domain logic and application logic. You
should see no functional change to your application if you choose to invoke
cache management services such as flush() and evict() yourself, but
depending on the deployed environment and the application use case you might
see performance improvement and/or degradation as a result of programmatic cache
management. An example is evicting each instance resulting from a query
execution after you have processed it. By evicting the instance sooner
rather than waiting for the implementation to choose to do so you reduce
your overall resource utilization, but the application is functionally
identical.
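
The eviction example can be sketched the same way. EvictSketch is a hypothetical cache, not a JDO PersistenceManager, and "loading" an instance just builds a string; the point is only that eager eviction changes resource usage and not the result.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy instance cache; the names are hypothetical.
class EvictSketch {
    private final Map<Integer, String> cache = new HashMap<>();
    private int peakCacheSize = 0;

    // Load an instance into the cache (a real implementation would hit the datastore).
    private String load(int id) {
        String instance = cache.computeIfAbsent(id, key -> "item-" + key);
        peakCacheSize = Math.max(peakCacheSize, cache.size());
        return instance;
    }

    void evict(int id) { cache.remove(id); }

    // Process each query result; optionally evict it as soon as it is handled.
    int totalLength(List<Integer> ids, boolean evictEagerly) {
        int total = 0;
        for (int id : ids) {
            total += load(id).length();
            if (evictEagerly) {
                evict(id);
            }
        }
        return total;
    }

    int peakCacheSize() { return peakCacheSize; }
}
```

Processing the same ids with and without eager eviction returns the same total; only the peak number of cached instances differs.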

So, to change data just change it. To save changes just commit the
transaction. Only in the case that you are making persistent a new object
that will not be referenced by another already-persistent (or
newly-persistent) object at commit time must you actually invoke persistence
services such as makePersistent(o) explicitly.

My train is approaching Gatwick Airport so I must post this and leave you.
I'm in Texas for a few days before getting back to London later this week
and my internet access may be more restricted than usual, but I'm sure this
discussion will carry on very happily in the interim.

RobinRoos


> Certainly I need control of the transaction boundaries (when to commit,
> etc.) If the layer was smart enough, I'd give up control over "save". It
> seems like a good framework could let me tweak that in the configuration of
> the mapping layer, rather than surfacing it to the application. But if the
> framework can't do it right every time, then it needs to give me at least
> an option to override.

I am probably confusing "save" with "commit". I probably mean "commit".

> Even then, it is the application layer and not the domain objects that are
> aware of this, right?

Indeed.

JBRainsberger


> In my experience, I don't /want/ transparent persistence. I /want/
> to say when "save" is called...

Gemstone still puts "commit" in the programmer's hands. Even so, I
think there are some problems with transparent persistence.

It's great for getting started quickly. It's great for stand-alone
applications. But...

(1) It's not well funded and suffers from lack of R&D relative to
relational databases.

(2) When the runtime is transactional, every interaction with another
transactional space is essentially a 2 phase commit.

(3) As the system grows in size and users, you have to pay ever more
attention to your graphs of objects. Not a bad problem to have if
you are aware of this as you proceed, but I think generally there
are diminishing returns.

(4) Not really a problem, but something to be aware of... you should
still design for OLTP vs. OLAP. Large volumes of regular data are
not the strong suit of transparent persistence, in 2003 at least.

> Am I alone here? Is transparent persistence /really/ all it's
> cracked up to be, or is it just another place where I have to rely
> on someone else to "get it right"?

Generally I think the industry needs to modify the *database* to
better support objects and O/R mapping. SQL and its data types need
revamping. Graphs, caching, querying, transactions, etc. need better
support to reduce programmer burden without going all the way to
completely transactional run-time images.

Ultimately for business processing and analysis I think better mapping
language and stream/event-oriented processing is desirable.

Transparent persistence is very appropriate for "persistent state
machines". I think something like Prevayler
(http://www.prevayler.org/wiki.jsp?topic=StartingPoints) could be more
tenable than an expensive, proprietary solution, though.

Over the next five years it's pretty clear we'll see ever more
hardware systems using MRAM
(http://computer.howstuffworks.com/mram.htm), which means essentially
transparent persistence for free. This will change everything,
utlimately. So OLTP/OLAP divisions, streaming/events, and
Prevayler-like approaches to persistent state machines will pay off
handsomely. (IMHO!) You'll be able to throw out some unused code instead
of adding more.

PatrickLogan


> [...] I think there are some problems with transparent
> persistence.
> [...]
> Generally I think the industry needs to modify the *database*
> to better support objects and O/R mapping. SQL and its data
> types need revamping. Graphs, caching, querying, transactions,
> etc. need better support to reduce programmer burden without
> going all the way to completely transactional run-time images.

I generally think that object-oriented databases (and possibly use of
object prevalence technology for operational databases of moderate
size) are the answer.

But the industry has voted with its dollars that they want RDBMS
servers, and will tolerate nothing else, regardless of the
consequences. I think that being ignorant of the alternatives is
probably the leading contributor to being blind to the consequences.

I do often tell people that the moment you mention object-relational
mapping, you have to face the fact that you will pay a performance
penalty for it. They never believe me, but I keep saying it. (And
they keep experiencing it. ;-)


Object-Oriented Databases:

http://makeashorterlink.com/?T2BE22DE6 =
http://directory.google.com/Top/Computers/Programming/Languages/Java/Databases_and_Persistence/Database_Management_Systems_-_DBMS/Object-Oriented/

Object Store:
http://www.objectstore.net/products/objectstore/index.ssp


> Transparent persistence is very appropriate for "persistent
> state machines". I think something like Prevayler
> (http://www.prevayler.org/wiki.jsp?topic=StartingPoints)
> could be more tenable than an expensive, proprietary
> solution, though.

If you hadn't mentioned Prevayler, I would have. ;->

> Over the next five years it's pretty clear we'll see ever
> more hardware systems using MRAM
> http://computer.howstuffworks.com/mram.htm [...]

Hmmm... Most interesting. When can I buy it? ;->

JeffGrigg



