From: | "Selena Deckelmann" <selenamarie(at)gmail(dot)com> |
---|---|

To: | pdxpug(at)postgresql(dot)org |

Subject: | Relational Algebra Meeting Recap |

Date: | 2007-09-26 18:52:01 |

Message-ID: | 2b5e566d0709261152x34a23f19ye5b93fe9f5a7fb4@mail.gmail.com |

Views: | Raw Message | Whole Thread | Download mbox | Resend email |

Thread: | |

Lists: | pdxpug |

Hello 'PUGers,

Last week's meeting was all about Relational Algebra. And mixed

drinks. And our ONE YEAR ANNIVERSARY! That's right - PDXPUG has

been around for a full year.

Twenty-one people attended - with one intrepid traveller from Eugene,

and Alec, who is moving to Seattle. We hope Alec starts a PUG with the

Whitepages folks. We also had visitors from the PHP and Ruby

communities. And even Wil Cooley. Very good to see you all!

Several people brought mixers and raw material for the bar. In

particular, Jason from the Python Users Group, Gabrielle and Schwern.

Did I miss anone? Gabrielle served as bartender for over an hour.

Thank you all very much!

We started with a short discussion of HOT (which stands for Heap ONLY

tuple), courtesy of Jeff Davis. The bottom line here is that it is a

big performance enhancement for tables which are UPDATE'd frequently.

Rather than appending modified tuples to the end of an index, existing

tuples are updated in place - which preserves the order of an index

and greatly reduces the need for a table-wide VACUUM.

There is an extensive README available:

http://archives.postgresql.org/pgsql-patches/2007-09/msg00261.php

---

Rafael opened the Relational Algebra talk with a promise to explain

how all the theory he & James were about to dive into would ultimately

be useful for the practical work of designing databases.

Any discussion of relational theory is incomplete without a definition

from E.F. Codd: "Any instance of the relation is always a subset of

the cross product of its domains." Uh-huh.

Some definitions: every relation is a mathematical set. A relation is

an unordered set of tuples, AKA a table or the result of an SQL query.

A tuple is a single element of a relation - in database terms, it is

a row.

There was a short discussion about why mathematical definitions are

useful, and what the database community gained by leveraging proofs

developed by set theorists. I believe somewhere during the discussion

of bag theory, we had a Proof By Intimidation!

Rafael & James then discussed nine operators:

Project (the verb) - AKA SELECT in SQL

Select - AKA WHERE in SQL

Union (direct from set theory)

Intersection (direct from set theory)

Difference (direct from set theory)

Cross product

Join

Divide

Rename

The symbols and explanations for each of these can be found at:

http://en.wikipedia.org/wiki/Relational_algebra

One operator that is typically confusing is the Divide operator -

James summarized the Divide's basic function as "Don't give me what I

don't want."

After explaining all the operators, James dove into the practical

application portion of the talk. Being able to find equivalences is

the key to simplifying and optimizing queries. In particular,

eliminating sub-queries can give significant performance improvements.

The basic strategy behind query optimization is: Translate the SQL

query into a query tree using relational algebra operators; Generate

other, equivalent query trees; For each query tree - select an

algorithm for calculating the result and estimate the cost; Choose the

lowest cost plan!

This is what the PostgreSQL planner does, and understanding which

query tree is being used (through EXPLAIN!) in combination with the

algorithms used to fetch tuples is a valuable skill when designing

databases and queries. James also dug out an old joke from a meeting

last year about growing one's own coffee from the bean. Thank you

James for remembering our old jokes.

The meeting concluded, and several of us walked to the Lucky Lab.

--

Special thanks to Gabrielle for her notes and her duct tape.

--

Selena Deckelmann

PDXPUG - Portland PostgreSQL Users Group

http://pugs.postgresql.org/pdx

http://www.postgresqlconference.org

From | Date | Subject | |
---|---|---|---|

Next Message | Selena Deckelmann | 2007-09-26 18:53:36 | conference -- Lunch Options near PSU volunteer |

Previous Message | Rafael J. Fernández-Moctezuma | 2007-09-21 06:46:15 | Relational Algebra references |