logo       

Sponsor
FREE Network Mapping Tool for Microsoft® Office Visio® Professional 2007
Don't map your network by hand - let LANsurveyor Exx press for Microsoft Visio Professional 2007 automatically create network diagrams for you!

Re: Join order and access path: msg#00299

apache.db.derby.devel

Subject: Re: Join order and access path


1. Since you wrote the original optimizer if you were currently the lead
architect what would your recommendations be for enhancing or improving the
optimizer?

Ok, Dan, shove over!

It's hard to consider the optimizer by itself. Many optimizer enhancements would work with changes in other areas, especially execution.

One area that I think needs work is hash joins. The current implementation uses an in-memory hash table. The optimizer avoids using the hash join strategy when it estimates that the hash table would use too much memory. There are a few problems with this: the optimizer doesn't really know how much memory is available, its estimates of memory usage are crude and possibly inaccurate, and it's possible for the query to fail if a hash table gets too big during execution.

I would like to see hash tables that spill to disk. Ideally, the hash table should be an in-memory structure with a conglomerate as a backing store. I would want the backing store to be used only when necessary - that is, only when the hash table grows too big. The biggest problem with this idea is how to estimate the cost of building and maintaining the table. One approach could be to put a limit on the number of in-memory rows in the hash table and use a statistical formula for the cost of reading a row, using the number of in-memory rows versus the total number of rows to estimate the chances of finding a row in memory.

Another approach could be to use weak references in managing the hash table (a weak reference is a Java construct that allows the garbage collector to nominate an object for garbage collection even when it has a reference to it). Weak references are useful for memory caches that adjust themselves to the memory available to the JVM. One of our original ideas for Derby (nee Cloudscape) is that it should be a low-maintenance DBMS, with little intervention required to keep a working system running. A self-managing cache could help with this - it would adjust itself to environments with different amounts of memory (although small-memory environments could hurt performance). I don't know how the optimizer would estimate the cost for building and maintaining a hash table in this case.

I also think merge joins are worth considering, especially if nothing is done about hash joins. Merge joins are useful for many of the same types of queries as hash joins and, since they use the sorter (assuming the joined rows are not already ordered on the join colums) they can work even for large tables (because the sorter spills to disk if the data being sorted won't fit in memory). Merge joins can have a very low cost if the rows are already ordered (which they can be if there are indexes on the join columns). Merge joins can also work well with sort avoidance if the ordering for the merge is the same as for the ORDER BY clause.

Switching gears, another problem is the cost of user-defined functions. What do you do with a query like this?:

select *
from t1, t2, t3
where t1.c1 = user_defined_function(t2.c2)
and t2.c3 = t3.c4

If the user-defined function is cheap and there's an index on t1.c1, you may want to call the function for every row in t2 and use the result to probe into the index on t1. On the other hand, if the function is expensive, you may want to try to execute it as few times as possible, which could make it unfeasible to use it to probe into t1. Currently Derby has no modeling for the cost of user-defined functions and avoids pushing them down into the query plan (that is, it calculates user-defined functions as late as possible before returning the rows from the query).

This may seem trivial, but keep in mind that a user-defined function can do anything, from something as simple as returning a constant to as complex as executing a query on another DBMS. It really can be important to know the cost of a user-defined function.

One possible approach would be to have a way of telling the DBMS to execute the function and remember how long it takes, and then store this in a system catalog for the optimizer to use. Another approach would be to allow the user to register the cost of a function as low, medium or high.

Switching gears again, another feature I think would be generically useful would be indexes on expressions (instead of just on columns). One potential use for this feature is case-independent searches (which can be done now, but which tend to be slow because the functions involved prevent the optimizer from using an index). The biggest problem here is the representation of index definitions in the SYSCONGLOMERATES system table (which assumes that an index column corresponds to a column in a base table).

Another area for investigation is the flattening of subqueries to joins. Currently, the language compiler flattens IN and EXISTS subqueries to types of joins. This is good because it gives the optimizer more options in choosing query plans for these types of subqueries, and it also allows the optimizer to get better cost estimates (for complex technical reasons I won't go into here). There are other types of subqueries that could be flattened - for example, a NOT IN or NOT EXISTS subquery can often be flattened to an outer join.

Another thing that could be done is to allow the optimizer to invert the join order of IN and EXISTS subqueries. As mentioned above, these subqueries are often flattened to joins. The joins are special, in that at execution it looks for only one match in the inner table per row from the outer table. This strategy requires that the table in the IN or EXISTS subquery remain inner to the joining table from outside the subquery. It would be possible to invert the join order if a sort were done on the subquery table to eliminate duplicate joining values. (Actually, I'm oversimplifying here, but I would have to write a treatise otherwise.)

2. What other optimizer architectures or features did you consider and
reject during the original design and development? Would you recommend any
of these for reconsideration now?

The original optimizer was bare bones - we needed to get some sort of cost-based optimizer working in just a few months. I was working in a hurry and, as a result, the optimizer is not as extensible as I would have liked. So far, Derby supports only one type of index (B-Tree) and there is probably a lot of code that assumes this and isn't well-modularized. Ideally, it should be possible for users to add their own types of indexes without changing much in the optimizer.

One really far-out idea I had in the early days of Cloudscape was to have the system measure the performance of different parts of the query at execution time and store the measurements for the optimizer to use later. This way, the optimizer would "learn" and make better decisions as time went on.


3. Could hints be used to good advantage in Derby?

4. What type of hints might be most effective?
A. Index hints (index: t1-zip_code)?
B. Join order hints (join-order: t3, t2, t4, t5, t1)?
C. Join type hints (join-type: hash-table/nested-loop/other)?
D. Plan_table hints? Store an execution plan in a table to always used
for a particular statement
E. Other hints?

5. What 'hint' implementation approach would you recommend given the current
architecture? Are changes needed to the architecture to effectively
implement your recommendation?

6. Would it be possible to implement a 'force' option that would force Derby
to use a particular hint rather than simply 'suggest' an approach? This
could be very useful especially during dev/test: it may be harder to compare
two execution plans if you can't force Derby to use each plan for exactly
the same set of conditions.

I'll answer all of these questions at once. I originally implemented the force option approach. I believe this is the best way to do it - the main purpose is to allow users to fix problems when the optimizer doesn't do what he or she wants. I also believe that, if the optimizer can choose it, it should be possible to over-ride it. That is, it should be possible for the user to completely wire up a query plan.

A few different approaches have already been discussed here. I originally implemented optimizer hints (actually commands, rather than hints) using PROPERTIES clauses in the FROM list. Some people object to this approach because it forces non-standard syntax. Some have suggested putting the hints in comments, while others (Dan, especially) thought it would be a good idea to put them in a separate place (like in a properties file). Of these, I prefer putting the hints in comments. There are a few problems with storing them apart from the query: since the lookup key would presumably be the query text itself, any change in the query would cause the hints to no longer apply; one would have to come up with a way to tell the optimizer which parts of the hints should be associated with which parts of the query (a query can have many FROM clauses, and the same table (and even correlation name) can appear many times in the same query); putting the hints far from the query creates a pitfall for anyone who doesn't know about the hints.

The two problems I can see with putting hints in comments are that the Derby comment format (based on the SQL standard) is not conducive to this (i.e. there's no way to put a comment in the middle of a line), and it could be hard to get the hints from the lexer (where comments are processed) to the parser (where the FROM clause syntax is recognized).


7. What information is currently available from the optimizer, via logs or a
debug setting, to show the explain plan and/or list the options (e.g. join
order) that were considered, including options that were not chosen? Is
there additional information that would be useful that could be easily
extracted?

I originally implemented a trace option to make the optimizer print out all of its cost estimates and decisions as it went. It produced a *LOT* of output. I see that the code is still in there, but I don't know whether it still works.


8. Are there others database, table or index statistics that could be
captured that might faciitate optimization?

As I mentioned above, statistics on the cost of user-defined functions could help. Statistics on the costs of joins could help.


9. Other possible plan-table strategies? For example, store in a plan_table
information about table relationships (master/detail, fact/dimension) so
that certain table combinations are always joined the same way regardless of
the query?

This goes against the concept of a cost-based optimizer. I prefer making decisions based on costs, rather than on hard-wired relationships. Optimizers can surprise you by coming up with better plans that you would yourself.


Sorry to barrage you with questions. No one's trying to 'rope' you into
rejoining the project.

"Just when I thought that I was out they pull me back in."

- Michael Corleone, Godfather III


- Jeff Lichtman
swazoo-KealBaEQdz4@xxxxxxxxxxxxxxxx
Check out Swazoo Koolak's Web Jukebox at
http://swazoo.com/




<Prev in Thread] Current Thread [Next in Thread>
Sponsor
FREE Network Mapping Tool for Microsoft® OfficeVisio Professional 2007
Don't map your network by hand - let LANsurveyor Express for Microsoft Visio Professional 2007
automatically create network diagrams for you!
Google Custom Search

Free Magazines

Cisco News
Receive a free quarterly e-newsletter with exclusive articles on how Cisco IT uses its own products and solutions to enable the business.
subscribe

Systems Management News, the newspaper for IT systems administration and data center managers! Each issue of Systems Management News is chock-full of news and analysis to help you understand what's happening in your field.
subscribe

The Enterprise Newsweekly eWeek is the essential technology information source for builders of e-business.
subscribe

Oracle Magazine Oracle Magazine contains technology strategy articles, sample code, tips, Oracle and partner news, how to articles for developers and DBAs, and more. Oracle (NASDAQ: ORCL) is the world's largest enterprise software company.
subscribe

Total Telecom Total Telecom is "The Economist of the communications industry".
subscribe

Navigation

Home | sitemap | advertise | OSDir is an inevitable website. super tiny logo