Posted on October 30, 2008 in Java, Software Development by Rob Di Marco

For the umpteenth time, I had to deal with the dreaded Java OutOfMemoryError, this time when trying to run the Fisheye Subversion browser.  My problem stems from a known issue in Fisheye where a background indexing task sometimes causes an OOM error.  Of course, an OOM does not just affect the indexing task; it can also impact other operations unrelated to the task with the memory leak.  Diagnosing this problem got me thinking about how this could be better managed.

The Problem

The JVM defines a single heap for all objects created across all threads in the Java process.  A consequence of this is that if one processing thread has a memory leak, any thread in the JVM may start suffering from OOM errors.  However, not all threads are equally important to my application.  I may not mind too much if a batch processing thread fails with an OOM, but I care very much if the lack of memory leaves all of my Tomcat threads unable to process web requests.
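
To make the shared-heap point concrete, here is a minimal sketch in which a worker thread retains memory and the main thread sees the process-wide used heap grow.  The class name, the thread name "batch-indexer", and the 32 MB figure are illustrative assumptions, not anything taken from Fisheye or Tomcat.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: shows that allocations made by one thread count against the one shared heap. */
final class SharedHeapDemo {
    // Memory held on to by the "batch" thread; static so it stays reachable after the thread ends.
    static final List<byte[]> retained = new ArrayList<>();

    /** Used heap for the whole process; the JVM exposes no per-thread figure here. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Runs a worker that retains the given number of megabytes and returns the growth in used heap. */
    static long growthCausedByWorker(int megabytes) {
        long before = usedHeap();
        Thread worker = new Thread(() -> {
            for (int i = 0; i < megabytes; i++) {
                retained.add(new byte[1024 * 1024]); // never released: a deliberate "leak"
            }
        }, "batch-indexer");
        worker.start();
        try {
            worker.join(); // join() also makes the worker's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for worker", e);
        }
        return usedHeap() - before;
    }

    public static void main(String[] args) {
        long growth = growthCausedByWorker(32);
        System.out.println("Used heap seen from main grew by about "
                + growth / (1024 * 1024) + " MB of a "
                + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB maximum");
    }
}
```

The numbers reported by Runtime are approximate and depend on the garbage collector, but the direction is the point: the memory the worker holds is memory no other thread can use.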

Proposed Solution

What I would like is to be able to segment my heap so that I can dedicate portions of the heap (either by percentage of the total heap or by absolute number of bytes) to a specific set of work.  That way, if an OOM occurs, I can contain its impact to a certain set of threads while other threads continue processing.  I could see this being configured through JVM runtime -X arguments, something like: -XHeapSegment:Name=MyMemHeap,Size=25%,Thread=<ThreadNameRegex>

I could see the partitioning being relative (by percentage) or absolute (total bytes allocated).  I could also see linking it to ThreadGroups as opposed to just Threads.
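
As a rough illustration of how such an option might be interpreted, the sketch below parses the proposed argument and resolves a relative or absolute size into a byte budget.  Everything here is hypothetical: no JVM I know of accepts -XHeapSegment, and the class name, field names, and the rule that a Size without "%" is a plain byte count are my own assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical parser for the proposed -XHeapSegment option; not a real JVM flag. */
final class HeapSegmentSpec {
    static final String PREFIX = "-XHeapSegment:";

    final String name;
    final boolean relative;    // true: size is a percentage of the total heap
    final long size;           // percentage (1-100) or absolute number of bytes
    final Pattern threadNames; // threads whose names match are charged to this segment

    private HeapSegmentSpec(String name, boolean relative, long size, Pattern threadNames) {
        this.name = name;
        this.relative = relative;
        this.size = size;
        this.threadNames = threadNames;
    }

    static HeapSegmentSpec parse(String arg) {
        if (!arg.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a heap segment option: " + arg);
        }
        Map<String, String> fields = new HashMap<>();
        // Naive split: a thread-name regex containing ',' would need an escaping scheme.
        for (String pair : arg.substring(PREFIX.length()).split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected Key=Value, got: " + pair);
            }
            fields.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        String sizeText = required(fields, "Size");
        boolean relative = sizeText.endsWith("%");
        long size = Long.parseLong(
                relative ? sizeText.substring(0, sizeText.length() - 1) : sizeText);
        if (size <= 0 || (relative && size > 100)) {
            throw new IllegalArgumentException("Bad Size: " + sizeText);
        }
        return new HeapSegmentSpec(required(fields, "Name"), relative, size,
                Pattern.compile(required(fields, "Thread")));
    }

    private static String required(Map<String, String> fields, String key) {
        String value = fields.get(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("Missing " + key);
        }
        return value;
    }

    /** Byte budget for this segment; divides before multiplying to avoid overflow. */
    long resolveBytes(long totalHeapBytes) {
        return relative ? totalHeapBytes / 100 * size : size;
    }

    /** True if a thread with this name would be assigned to the segment. */
    boolean appliesTo(String threadName) {
        return threadNames.matcher(threadName).matches();
    }

    public static void main(String[] args) {
        HeapSegmentSpec spec = parse("-XHeapSegment:Name=MyMemHeap,Size=25%,Thread=batch-.*");
        long budget = spec.resolveBytes(Runtime.getRuntime().maxMemory());
        System.out.println(spec.name + " would get " + budget + " bytes");
        System.out.println("Covers batch-indexer? " + spec.appliesTo("batch-indexer"));
    }
}
```

Parsing the option is the easy part.  The hard part, which this sketch does not attempt, is having the allocator and garbage collector actually charge each allocation to a segment and enforce the limit.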

Some Questions

  • Do others see value in this proposal?
  • I am not an expert at JVM internals.  Is there a fundamental reason this would not work?
  • Any other suggestions?
  • Should we allow this segmentation to be defined at compile-time as well through annotations?

Notes

  • For the purposes of the problem I defined, I do not have a requirement that the young generations also be segmented.  However, if it is impractical to keep a single set of young generations and segmented older generations, I can deal with having segmentation in the young generations as well.
  • Obviously, this would in no way address all OutOfMemoryErrors.  We can still look forward to running out of PermGen space.

Posted on October 29, 2008 in Software Development by Rob Di Marco

I have been reading The Black Swan: The Impact of the Highly Improbable by Nassim Nicholas Taleb (a.k.a. NNT) and have been thinking about its impact on some of the assumptions that I have made.  One of its main points is how often we accept as proven fact theories that ignore silent evidence.  Software professionals fall for this kind of logical failure all the time.

Let’s consider a theory that ignores silent evidence.  I plan on voting for Barack Obama for president.  If you were to draw conclusions from a poll that consisted ONLY of asking me who I would vote for, it would be logical for you to conclude that Obama is going to overwhelmingly win both Pennsylvania and the national popular vote.  These predictions may very well prove to be accurate, and you might be hailed as a tremendous prognosticator.  But most serious analysts would mock you, and they would be correct to do so.  Because of the minuscule sample size, your poll does not accurately reflect the entire population that will be participating in the election.  The "silent evidence" is the opinions of all the voters who have NOT been polled.  A good pollster mitigates the problem of silent evidence by trying to come up with a representative sample; the idea is that if you can get a large enough sample broken down by representative demographics, you can make a reasonable prediction.

So what’s the point?  Let’s consider another example.  In a famous essay, Paul Graham talks about the advantage of using functional programming languages when creating your startup.  If you haven’t read the post, I highly recommend it; it is well written and thought-provoking.  However, it clearly ignores the silent evidence.  Graham takes his singular experience and draws a conclusion from it.  But consider the silent evidence:

  • How successful are functional-language-based software startups versus all software startups?

or a slightly different cohort:

  • How successful were functional-language startups founded during 1995 (when Viaweb was started) versus all software startups founded during that year?

With both of these cohorts, we should probably include in the sample those kitchen-table companies that started but never shipped a product.  Maybe many functional-language startups fizzle out quickly and only the ones with really great ideas survive.

Of course, I could just as easily make the argument that startups that have two really, really smart co-founders who have impressive technical and business savvy and are starting their company in an exploding market have a much better probability of success than companies without those advantages.  My theory is no more right or wrong than the one posited above; we are both ignoring the silent evidence in positing our theories.

My point is not to say Paul Graham is wrong about startups and functional languages.  He is a smart guy and has accomplished much more than I have.  The Beating the Averages post was just the first one that popped into my mind while reading The Black Swan; I’m sure I could find countless other technical articles.  But I do want to introduce some skepticism into the reading of his post and those of many other software legends.  When critically reading a book, article, or blog posting, consider the silent evidence before making a judgement on the theory.

Posted on October 23, 2008 in Software Development by Rob Di Marco

Interesting article from Fortune on why talent is overrated.  As a society, we tend to overvalue innate talent.  However, most success comes from focused hard work rather than pure talent.  Tiger Woods is great because of how focused he is on constantly improving his game.

As I talked about in Effective Technology Teams, what someone knows is much less important than what they can learn.  For further reading, I highly recommend the book Mindset by Carol Dweck.  It is a fascinating book that helped make this concept clear to me.