"Premature optimization is the root of all evil." (D. Knuth)(But not to oversimplify, see also one or two other views.)
SELECT relpages, reltuples FROM pg_class WHERE relname = 'ab';
psql
na serveri cvika.dcs.fmph.uniba.sk
EXPLAIN SELECT name, deptno, COUNT(empno) OVER (PARTITION BY deptno) FROM emp;
EXPLAIN SELECT emp.name, dept.name, COUNT(empno) OVER (PARTITION BY emp.deptno) FROM emp JOIN dept ON emp.deptno = dept.deptno;
a skúste pochopiť, aké činnosti ide vykonať databáza. Tieto činnosti sú popísané pomocou fyzických operátorov, ktoré sú "podrobnejšie" ako operátory relačnej algebry (napr. join je možné spraviť naivným spôsobom "každý s každým" alebo ho urýchliť cez triedenie či hašovanie --- toto plán výpočtu dotazu v relačnej algebre nerozlišuje). Porovnávať plány výpočtu v relačnej algebre je možné, až keď sú jej operátory namapované na fyzické operátory.
explain analyze select * from ab;
explain analyze select * from ab where b < 4 order by b;
explain analyze select * from ab where b = 4 order by b;
create index i1 on ab (b); -- we hope that adding an index will shorten running time
explain analyze select * from ab where b < 4 order by b;
explain analyze select * from ab where b = 4 order by b;
create index i1h on ab using hash(b); -- the default index type is btree, we want to try hash index too
explain analyze select * from ab where b < 4 order by b;
explain analyze select * from ab where b = 4 order by b;
Execute the last command ten times in a row and watch how the real time consumed changes.
drop index i1;
drop index i1h;
explain select * from ab, bc; -- materialize stores the result in memory so that we can look at it more than once
explain select * from ab, bc where ab.b = bc.b;
insert into bc select x.id, x.id + 1 from generate_series(1, 1000000) as x(id);
explain select * from ab, bc where ab.b = bc.b;
explain select b, count(distinct c) from bc where not exists (select 1 from ab, cd where ab.b = bc.b and cd.c < bc.c) group by b having count(distinct c) < 3;
Beware of using analyze
: it can take a very long time.
Run psql -f explain.sql. Let's compare the plans for the queries A, B, C, D below in several different situations (they differ in the existence of indexes and the amount of data in the tables); we recommend saving the plans for individual repeated queries in separate files for easy comparison.
(A) explain select * from ab, bc where ab.b = bc.b;
(B) explain select * from ab, bc where ab.b = bc.b order by ab.b;
(C) explain select * from ab, bc where ab.b < bc.b;
(D) explain select * from ab, bc where ab.b < bc.b order by ab.b;
Now we will gradually create indexes and populate tables.
create index i1 on ab (b);
/* run all of A, B, C, D */
create index i2 on bc (b);
/* run all of A, B, C, D */
insert into bc select x.id, x.id + 1 from generate_series(1, 1000000) as x(id);
insert into bc select x.id, x.id + 1 from generate_series(1, 1000000) as x(id);
insert into bc select x.id, x.id + 1 from generate_series(1, 1000000) as x(id);
/* run all of A, B, C, D */
drop index i2;
create index i2composite on bc (b, c);
/* run all of A, B, C, D */
create index i1h on ab using hash(b);
create index i2h on bc using hash(b);
/* run all of A, B, C, D */
drop index i1;
drop index i2composite;
/* run all of A, B, C, D */
Notice how the "time" calculated by the scheduler grows if the index contains unnecessary columns.
Run psql -f explain.sql analyze plans for the following commands.
explain select * from ab, bc, cd where ab.b = bc.b and bc.c = cd.c;
explain select * from ab, bc, cd where ab.b = bc.b and bc.c < cd.c;
explain select * from ab, bc, cd where ab.b < bc.b and bc.c < cd.c;
create index i3 on cd(c);
explain select * from ab, bc, cd where ab.b = bc.b and bc.c = cd.c;
explain select * from ab, bc, cd where ab.b = bc.b and bc.c < cd.c;
explain select * from ab, bc, cd where ab.b < bc.b and bc.c < cd.c;
create index i4 on bc (b, c);
explain select * from ab, bc, cd where ab.b = bc.b and bc.c = cd.c;
explain select * from ab, bc, cd where ab.b = bc.b and bc.c < cd.c;
explain select * from ab, bc, cd where ab.b < bc.b and bc.c < cd.c;
create index i3h on cd using hash(c);
create index i4h on bc using hash(c);
explain select * from ab, bc, cd where ab.b = bc.b and bc.c = cd.c;
explain select * from ab, bc, cd where ab.b = bc.b and bc.c < cd.c;
explain select * from ab, bc, cd where ab.b < bc.b and bc.c < cd.c;
insert into ab select x.id, x.id + 1 from generate_series(1, 100000) as x(id);
insert into bc select x.id, x.id + 1 from generate_series(1, 100000) as x(id);
insert into cd select x.id, x.id + 1 from generate_series(1, 100000) as x(id);
explain select * from ab, bc, cd where ab.b = bc.b and bc.c = cd.c;
explain select * from ab, bc, cd where ab.b = bc.b and bc.c = cd.c order by cd.c;
explain select cd.c, count(*) from ab, bc, cd where ab.b = bc.b and bc.c = cd.c group by cd.c;
\du \from tablename SELECT grantee, privilege_type FROM information_schema.role_table_grants WHERE table_name='test';
CREATE TABLE test_name (i INTEGER, t TEXT);and add a few lines to it:
INSERT INTO test_name VALUES (1, 'a'); INSERT INTO test_name VALUES (2, 'b');
[city, number_of_days, number_of_places]
, where number_of_places
determines the number of places that can be reached from the city of city
in number_of_days
days.psql -f world.sql
sqlite3 --init emp.sql emp.db
sqlitebrowser emp.db &or by
sqlite3 emp.db
SELECT * FROM emp;
sqlite3 emp.db < queries.sql(to run queries, you also use Execute SQL tab in the sqlitebrowseri.)
For each of the 3 homework assignments, there is a max. 30 points.
Additional points are obtained for solving exersises from lessons: 0 or 1 point for each of the 12 lessons. From the points awarded for lesson exercises, it is necessary to obtain at least 9 to successfully complete the course.
Within 3 days after each lesson, solutions must be sent by e-mail to the address rjasko (at) dcs.fmph.uniba.sk. Subsequently, the solutions will be briefly evaluated: in case of sufficiency, 1 point will be awarded, in case of insufficiency, the student will be asked to supplement the solution (if he submitted at least something meaningful), for which he has another 3 days, and then he will receive a final evaluation.
Not all exercises need to be solved; it is neccessary to have solved (at least somehow correctly) about 40% of the exercises from each topic. If the set of exercises is more extensive, they are usually at least approximately in order of difficulty; if you want to solve a minimum of exercises, choose some more demanding ones from the second half. The recommended procedure is to go in order and skip exercises where it is clear that you already know how to solve them (eg they are similar to already solved exercises and you can see what the difference is and how to deal with it).