Relational Algebra 关系代数
Relational Algebra
Relational Algebra is the mathematical basis for the query language SQL
Introduction.
So now you have learn how to design good conceptual models to store information with the ER-model
And you also know how to turn an ER-model into a Relational model
Let us jump the next step - which is defining the Relational Database and entering the data into the database - and look atdata manipulation
What can we do with the data ????
Relational Algebra: Overview
is amathematical languagefor manipulating relations (ala mathematical definition of "relation" - which is a subset of some cartesian product)
contains:
set operators
relational database specific operators
set functions
is used to specify a result set that satisfies a certain "selection criteria"
the result of an operation in relational algebra is always arelation
Set operations
You should all know whatset unionis:
SupposeA = {i, j, k}andB = {k, x, y}
Then the union ofAandBis{i, j, k, x, y}
Note that tuples in setsAandBmust have the sameFORMAT(same number and type of attributes) to be used in the union operation.
You should also have learned whatset intersectionis:
SupposeA = {i, j, k}andB = {k, x, y}
Then the intersection ofAandBis{k}
Note that tuples in setsAandBmust have the sameFORMAT(same number and type of attributes) to be used in the intersection operation.
You should also have learned whatset differenceis:
SupposeA = {i, j, k}andB = {k, x, y}
Then the difference ofAandBis{i, j}
Note that tuples in setsAandBmust have the sameFORMAT(same number and type of attributes) to be used in the set difference operation.
I have already discussed thecartesian product:click here
Set Functions
Operates on a set of values and produce a single value
To illustrate the various set function, weapplytheset functionson this set of value:
A = {1, 4, 9}
Results:
|
sum(A)= sum of all values ofA(answer= 14) |
Relational database specific operations
σ(Condition)(R) (selection)
Selects tuples from a relationRthat satisfies the conditionCondition
TheConditionexpression containsconstantsand/orattributesof relationR(the condition expressioncannotbe aset!!!)
The result ofσ(Condition)(R)is asubsetof tuples ofRthat satisfies the boolean conditionCondition
Examples:
Retrieve all employee tuples working for department number 4:
| σ(dno = 4)(employee) |
|---|
The following diagram shows the result graphically (hopefully, it will illustrates the concept unambiguously):
Retrieve all employees earning more than $30,000:
| σ(salary > 30000)(employee) |
|---|
Retrieve all employees working for department number 4 and eaning more than $30,000:
| σ(dno = 4 and salary > 30000)(employee) |
|---|
OR:
| σ(salary > 30000)( σ(dno = 4)(employee) ) |
|---|
π(attribute-list) (projection)
Selects out only theattribute valuesattribute-listfrom all tuples in relationR
Theattribute-listcontains the list of attributes in relationRthat will be selected.
The result ofπ(attribute-list)(R)contains the is asubsetof tuples ofRthat satisfies the boolean conditionCondition
Examples:
Retrieve all SSN from the employees
| πssn(employee) |
|---|
The following diagram shows the result graphically (hopefully, it will illustrates the concept unambiguously):
Retrieve thesexinformation from all employees:
| πsex(employee) |
|---|
The followingdiagramshows the result graphically (hopefully, it will illustrates the concept ofprojectionunambiguously):
Notes:
|
Theoutputof aRelational Algebraquery isset. AsetinMathematicsdoesnothave duplicate elements Therefore, the result ofπsex(employee)is theset{F, M}becauseduplicatesare removed InSQL, the user will have theoptiontoremove duplicates That is becauseremoving duplicatesrequirelonger timeto process.... (The user has theoptiontoincurthe cost ornot...) |
Combining information from 2 relations
Consider the following query:
| Find FName and LName of all employees in the "Research" department |
Observation:
|
We will need to combine information in theemployeeanddepartmentrelationsto answer the query. |
But we must becarefulin combining the information:
Employee: +-----------+--------+---------+-----------+-----+ | ssn | fname | lname | bdate | dno | .... (other columns omitted) +-----------+--------+---------+-----------+-----+ | 123456789 | John | Smith | 09-JAN-55 | 5 | | 333445555 | Frankl | Wong | 08-DEC-45 | 5 | | 999887777 | Alicia | Zelaya | 19-JUL-58 | 4 | | 987654321 | Jennif | Wallace | 20-JUN-31 | 4 | | 666884444 | Ramesh | Narayan | 15-SEP-52 | 5 | | 453453453 | Joyce | English | 31-JUL-62 | 5 | | 987987987 | Ahmad | Jabbar | 29-MAR-59 | 4 | | 888665555 | James | Borg | 10-NOV-27 | 1 | +-----------+--------+---------+-----------+-----+ Department: +----------------+---------+-----------+--------------+ | dname | dnumber | mgrssn | mgrstartdate | +----------------+---------+-----------+--------------+ | Research | 5 | 333445555 | 22-MAY-78 | | Administration | 4 | 987654321 | 01-JAN-85 | | Headquarters | 1 | 888665555 | 19-JUN-71 | +----------------+---------+-----------+--------------+ |
Howto"combine"tuples fromEmployeeandDepartment:
|
Only combine atuplefromemployeewith atuplefromdepartmentwhentheirdepartment numbersare EQUAL (Remember we used thedepartment numberinEmployeeas aforeign key!!! Thedepartment numberinEmployee"refers"toonetuple inDepartment |
Solution:
First find the department number of the "Research" department:
| R1= πdnumber( σ(dname = 'Research')( department ) ) |
|---|
Graphically:
Then wecombinethedepartmentandemployeethrough a"mindless" cartesian product operation:
| R2= R1x employee |
|---|
The result of this operation is"tagging" an extra column "dnumber"to theemployeerelation:
Nowwe can use aselect operationto select out the"right tuples":
Becausednumberis now anattributeof the (new)relationR2!!!
| π(FName, LName)(σ(dno = dnumber)( R2) ) |
|---|
Graphically:
Efficiency consideration of queries
Correctnessis themost important propertyof a query
Thesecond most important propertyof aqueryisefficiency
There maybemultiple queriesthat retrieve thecorrect set of tuples(answer)
The difference of the queries may be theirexecution speed
Consider the following solution:
| π(FName, LName)(σ(dno = dnumber and dname = 'Research')( employee x department ) ) |
|---|
Although thisqueryiscorrect, it will takelonger time to processbecause of thelarger amount ofintermediatedatathat the query produces:
Thecartesian productemployee x departmentwillproducehuge number of tuples- many more than in the previous solution.
Join operations
The join operation (bow tie symbol) is equivalent to:
Example:
Retrieve all employees from the "Research" department
slower (but correct)solution:
It is slower because thejoin operationwill result in more tuples asintermediate results
Thefaster versionis:
The "theta join" operation
The condition in the join operation is of the form:
Condition 1 and Condition 2 and .... |
Each condition is of the form"x θ y", wherexandyare attributes or constants
Andθis one of the following relational operators:
|
== |
A join operation with a "general" join condition (whereθcan be any relational operation), is called atheta join
The "equi-join" operation
Often(almost always :-)), we use the equal relational operation in the join operation:
Condition 1 and Condition 2 and .... |
whereeach conditionis of the form:
x == y |
wherexandyare attributes or constants
Equi-join:
|
Ajoin operationwhere the thejoin conditioncontainonlyequality compare operatorsis called aequi-join |
The "natural join" (*) operation
The natural join operation is anequi-joinoperation follow by a projection operation that removes one of theduplicateattributes.
Since the equi-join will for tuples when the attribute values are equal, all tuples that produced by an equi-join will have the same attribute value under the join condition attributes.
The natural join operation is denoted by a star (*):
R1 * attr1,attr2 R2 |
Outer-join
Outer-joins:
|
Anouter joindoesnotrequireeach record in the two joined tables to have amatching values. The rusulting joined table willretainsrecords fromoneorbothtables even ifno other matching value exists. |
The3 flavorsofouter-joins:
|
Leftouter-join:A ? B
Rightouter-join:A ? B
Fullouter-join:A ? B
|
Note:
|
Ifattribute namesare thesameinboth relations, you canomit the join conditioninouter joins In that case, you will alsoremoveone column of thesame namefrom theresulting table Example:
|
Teaching notes:
|
I personallydislikeNULLvalues.... Therefore, I willavoidusingouter joinsin my solutions.... |
More examples....
Find fname and lname of all employees working in the "Research" department that earn more than $50,000
Find fname and lname of John Smith's supervisor
Find fname and lname of all employees that have dependents
Find fname and lname of all employees thatdo nothave dependents
This solution iswrong:
The following diagram shows how the query works on an example database, which illustrateswhythe query is wrong:
The correct solution is:
Help1 = set of SSN of employees with dependents
Help2 = set of SSN of employees without any dependents
Find fname and lname of employees who has more than one dependents of the same sex (i.e., 2 or more boys or 2 or more girls):
NOTE:so join conditions do not always have to be "equal" !!!
A comment...
Clearly, I can easily formulate a gad-sillion different queries in the next day or so, so:
DO NOT
prepare for tests with
memorizing
facts - you gotta understand what's going on and do the queries "from scratch"....
And for those students that have never had me before: I have a reputation to challenge student's grey cells to work overtime.... so you should expect some very interesting queries.... and plenty of opportunity to practice your skill (in the homeworks and SQL projects).
http://www.mathcs.emory.edu/~cheung/Courses/377/Syllabus/4-RelAlg/intro.html
總結(jié)
以上是生活随笔為你收集整理的Relational Algebra 关系代数的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 微信小程序,引用扩展组件提示“没有找到可
- 下一篇: 微信小程序 筛选侧边栏 全选与反全选