项目作者: motapinto

项目描述 :
SQL: Query optimization
高级语言:
项目地址: git://github.com/motapinto/sql-query-optimization.git
创建时间: 2021-02-23T15:15:25Z
项目社区:https://github.com/motapinto/sql-query-optimization

开源协议:

下载


Teaching Service SQL: Query optimization

Table of Contents

  • Introduction
  • Setting Environment Y
    • Primary keys
    • Foreign keys
  • Setting Environment Z
    • Primary keys
    • Foreign keys
    • Indexes
  • Queries
    • Query 1
      • SQL Formulation
      • Result
      • Execution Plan
        • Environment X
        • Environment Y
        • Environment Z
      • Conclusion
    • Query 2
      • SQL Formulation
      • Result
      • Execution Plan
        • Environment X
        • Environment Y
        • Environment Z
      • Conclusion
    • Query 3.1
      • SQL Formulation
      • Result
      • Execution Plan
        • Environment X
        • Environment Y
        • Environment Z
      • Conclusion
    • Query 3.2
      • SQL Formulation
      • Result
      • Execution Plan
        • Environment X
        • Environment Y
        • Environment Z
      • Conclusion
    • Query 4
      • SQL Formulation
      • Result
      • Execution Plan
        • Environment X
        • Environment Y
        • Environment Z
      • Conclusion
    • Query 5
      • SQL Formulation
      • Result
      • Execution Plan
        • Environment X
        • Environment Y
        • Environment Z
      • Conclusion
    • Query 6
      • SQL Formulation
      • Result
      • Execution Plan
        • Environment X
        • Environment Y
        • Environment Z
    • Conclusion

Introduction

The project consists in the analysis of execution plans for different SQL queries and the evaluation of having different indexes and their impact on the perfomance of those queries.

For that, we created 3 different environemnts(X, Y and Z).

  • Environent X consists of a basic environemnt without any constraints or restrictions.
  • Environment Y consists of the same settings present in environment X but includes primary and foreign keys.
  • Environment Z consists of the same settings present in environment Y but also include indexes.

NOTE: You can populate the database by using the scripts in populate/

Setting Environment Y

Shortcut: To the top

Primary keys

  1. ALTER TABLE "YDOCENTES"
  2. ADD CONSTRAINT YDOCENTES_PKEY
  3. PRIMARY KEY("NR");
  4. ALTER TABLE "YDSD"
  5. ADD CONSTRAINT YDSD_PKEY
  6. PRIMARY KEY("NR", "ID");
  7. ALTER TABLE "YTIPOSAULA"
  8. ADD CONSTRAINT YTIPOSAULA_PKEY
  9. PRIMARY KEY("ID");
  10. ALTER TABLE "YOCORRENCIAS"
  11. ADD CONSTRAINT YOCORRENCIAS_PKEY
  12. PRIMARY KEY("CODIGO", "ANO_LETIVO", "PERIODO");
  13. ALTER TABLE "YUCS"
  14. ADD CONSTRAINT YUCS_PKEY
  15. PRIMARY KEY("CODIGO");

Foreign keys

  1. ALTER TABLE "YDSD"
  2. ADD CONSTRAINT YDSD_FKEY_YDOCENTES
  3. FOREIGN KEY("NR")
  4. REFERENCES "YDOCENTES"("NR");
  5. ALTER TABLE "YDSD"
  6. ADD CONSTRAINT YDSD_FKEY_YTIPOSAULA
  7. FOREIGN KEY("ID")
  8. REFERENCES "YTIPOSAULA"("ID");
  9. ALTER TABLE "YOCORRENCIAS"
  10. ADD CONSTRAINT YOCORRENCIAS_FKEY_YUCS
  11. FOREIGN KEY("CODIGO")
  12. REFERENCES "YUCS"("CODIGO");
  13. ALTER TABLE "YTIPOSAULA"
  14. ADD CONSTRAINT YTIPOSAULA_FKEY_YOCORRENCIAS
  15. FOREIGN KEY("CODIGO", "ANO_LETIVO", "PERIODO")
  16. REFERENCES "YOCORRENCIAS"("CODIGO", "ANO_LETIVO", "PERIODO");

Setting Environment Z

Shortcut: To the top

Primary keys

  1. ALTER TABLE "ZDOCENTES"
  2. ADD CONSTRAINT ZDOCENTES_PKEY
  3. PRIMARY KEY("NR");
  4. ALTER TABLE "ZDSD"
  5. ADD CONSTRAINT ZDSD_PKEY
  6. PRIMARY KEY("NR", "ID");
  7. ALTER TABLE "ZTIPOSAULA"
  8. ADD CONSTRAINT ZTIPOSAULA_PKEY
  9. PRIMARY KEY("ID");
  10. ALTER TABLE "ZOCORRENCIAS"
  11. ADD CONSTRAINT ZOCORRENCIAS_PKEY
  12. PRIMARY KEY("CODIGO", "ANO_LETIVO", "PERIODO");
  13. ALTER TABLE "ZUCS"
  14. ADD CONSTRAINT ZUCS_PKEY
  15. PRIMARY KEY("CODIGO");

Foreign keys

  1. ALTER TABLE "ZDSD"
  2. ADD CONSTRAINT ZDSD_FKEY_ZDOCENTES
  3. FOREIGN KEY("NR")
  4. REFERENCES "ZDOCENTES"("NR");
  5. ALTER TABLE "ZDSD"
  6. ADD CONSTRAINT ZDSD_FKEY_ZTIPOSAULA
  7. FOREIGN KEY("ID")
  8. REFERENCES "ZTIPOSAULA"("ID");
  9. ALTER TABLE "ZOCORRENCIAS"
  10. ADD CONSTRAINT ZOCORRENCIAS_FKEY_ZUCS
  11. FOREIGN KEY("CODIGO")
  12. REFERENCES "ZUCS"("CODIGO");
  13. ALTER TABLE "ZTIPOSAULA"
  14. ADD CONSTRAINT ZTIPOSAULA_FKEY_ZOCORRENCIAS
  15. FOREIGN KEY("CODIGO", "ANO_LETIVO", "PERIODO")
  16. REFERENCES "ZOCORRENCIAS"("CODIGO", "ANO_LETIVO", "PERIODO");

Indexes

One of the things that can most improve query times is to add indexes to foreign keys, when this is not done automatically. Furthermore, columns that are used in complex joins, in where conditions should also be considered.

To determine the most appropriate index type we had in consideration the types of queries. Since all of them are SELECT conditions, it obviously had a considerable weight in our decisions.

  1. CREATE INDEX IX_ZUCS_DESIGNACAO ON ZUCS (designacao);
  2. CREATE INDEX IX_ZUCS_CODIGO_CURSO ON ZUCS(codigo, curso);
  3. CREATE INDEX IX_ZUCS_CURSO ON ZUCS (curso DESC);
  4. CREATE INDEX IX_ZTIPOSAULA_ANOLETIVO ON ZTIPOSAULA (ano_letivo DESC);
  5. CREATE INDEX IX_ZTIPOSAULA_CODIGO ON ZTIPOSAULA(codigo);
  6. CREATE INDEX IX_ZTIPOSAULA_TIPO ON ZTIPOSAULA (tipo);
  7. CREATE INDEX IX_ZDSD_ID ON ZDSD(id);

Note: All indexes will be justified for each query. Here is just a demonstration of all indexes.

Queries

Query 1

Shortcut: To the top

Show the codigo, designacao, ano_letivo, inscritos, tipo, and turnos for the course ‘Bases de Dados’ of the program 275.

SQL Formulation

  1. SELECT codigo, designacao, ano_letivo, inscritos, tipo, turnos
  2. FROM XUCS UCS
  3. JOIN XOCORRENCIAS OCCURENCES USING (codigo)
  4. JOIN XTIPOSAULA CLASSTYPES USING (codigo, periodo, ano_letivo)
  5. WHERE UCS.curso = 275 AND UCS.designacao = 'Bases de Dados'

Result (in JSON format)

  1. {
  2. "results" : [
  3. {
  4. "columns" : [
  5. {
  6. "name" : "CODIGO",
  7. "type" : "VARCHAR2"
  8. },
  9. {
  10. "name" : "ANO_LETIVO",
  11. "type" : "VARCHAR2"
  12. },
  13. {
  14. "name" : "DESIGNACAO",
  15. "type" : "VARCHAR2"
  16. },
  17. {
  18. "name" : "INSCRITOS",
  19. "type" : "NUMBER"
  20. },
  21. {
  22. "name" : "TIPO",
  23. "type" : "VARCHAR2"
  24. },
  25. {
  26. "name" : "TURNOS",
  27. "type" : "NUMBER"
  28. }
  29. ],
  30. "items" : [
  31. {
  32. "codigo" : "EIC3106",
  33. "ano_letivo" : "2003/2004",
  34. "designacao" : "Bases de Dados",
  35. "inscritos" : 92,
  36. "tipo" : "T",
  37. "turnos" : 1
  38. },
  39. {
  40. "codigo" : "EIC3106",
  41. "ano_letivo" : "2003/2004",
  42. "designacao" : "Bases de Dados",
  43. "inscritos" : 92,
  44. "tipo" : "TP",
  45. "turnos" : 4
  46. },
  47. {
  48. "codigo" : "EIC3106",
  49. "ano_letivo" : "2004/2005",
  50. "designacao" : "Bases de Dados",
  51. "inscritos" : 114,
  52. "tipo" : "T",
  53. "turnos" : 1
  54. },
  55. {
  56. "codigo" : "EIC3106",
  57. "ano_letivo" : "2004/2005",
  58. "designacao" : "Bases de Dados",
  59. "inscritos" : 114,
  60. "tipo" : "TP",
  61. "turnos" : 4
  62. },
  63. {
  64. "codigo" : "EIC3111",
  65. "ano_letivo" : "2005/2006",
  66. "designacao" : "Bases de Dados",
  67. "inscritos" : "",
  68. "tipo" : "T",
  69. "turnos" : 1
  70. },
  71. {
  72. "codigo" : "EIC3111",
  73. "ano_letivo" : "2005/2006",
  74. "designacao" : "Bases de Dados",
  75. "inscritos" : "",
  76. "tipo" : "TP",
  77. "turnos" : 6
  78. }
  79. ]
  80. }
  81. ]
  82. }

Execution Plan

Environment X Environment Y Environment Z
Cost 642 55 14
Environment X


-

Environment Y

Since in this query there are two expensive join operations and a where clause, we can see that using primary and foreign keys have a great impact on the cost of the query.


-

Environment Z

This are the indexes used to improve the cost of the query. It was also used the constraints defined from the previous environment.

  1. CREATE INDEX IX_ZUCS_DESIGNACAO ON ZUCS (designacao);
  2. CREATE INDEX IX_ZTIPOSAULA_CODIGO ON ZTIPOSAULA(codigo);
  3. CREATE INDEX IX_ZUCS_CODIGO_CURSO ON ZUCS(codigo, curso);

The first and the second one are useful since the curso and designacao columns from ZUCS are used in the WHERE condition to get a UC from a specific degree with a specific designation.

The third one is also very useful because it reduces the cost of the hash joins operations. It uses a composite index with the columns codigo and curso.


Query 2

Shortcut: To the top

How many class hours of each type did the program 233 planned in year 2004/2005?

SQL Formulation

  1. SELECT DISTINCT UCS.curso, CLASS_TYPES.ano_letivo, CLASS_TYPES.tipo,
  2. SUM(CLASS_TYPES.turnos * CLASS_TYPES.horas_turno) AS TOTAL_HOURS
  3. FROM XTIPOSAULA CLASS_TYPES
  4. INNER JOIN XUCS UCS ON CLASS_TYPES.codigo = UCS.codigo
  5. WHERE CLASS_TYPES.ano_letivo='2004/2005'
  6. AND UCS.curso = '233'
  7. GROUP BY UCS.curso, CLASS_TYPES.ano_letivo, CLASS_TYPES.tipo;

Result (in JSON format)

  1. {
  2. "results" : [
  3. {
  4. "columns" : [
  5. {
  6. "name" : "CURSO",
  7. "type" : "NUMBER"
  8. },
  9. {
  10. "name" : "ANO_LETIVO",
  11. "type" : "VARCHAR2"
  12. },
  13. {
  14. "name" : "TIPO",
  15. "type" : "VARCHAR2"
  16. },
  17. {
  18. "name" : "TOTAL_HOURS",
  19. "type" : "NUMBER"
  20. }
  21. ],
  22. "items" : [
  23. {
  24. "curso" : 233,
  25. "ano_letivo" : "2004/2005",
  26. "tipo" : "TP",
  27. "total_hours" : 697.5
  28. },
  29. {
  30. "curso" : 233,
  31. "ano_letivo" : "2004/2005",
  32. "tipo" : "P",
  33. "total_hours" : 581.5
  34. },
  35. {
  36. "curso" : 233,
  37. "ano_letivo" : "2004/2005",
  38. "tipo" : "T",
  39. "total_hours" : 308
  40. }
  41. ]
  42. }
  43. ]
  44. }

Execution Plan

Environment X Environment Y Environment Z
Cost 50 50 10
Environment X

Environment Y

Compared to environment X, primary and foreign key constraints didn’t have any impact on the cost of the query.

Environment Z

These are the indexes used to improve the cost of the query. Additionally, the constraints defined from the previous environment were also used.

  1. CREATE INDEX IX_ZUCS_CURSO ON ZUCS (curso DESC);
  2. CREATE INDEX IX_ZTIPOSAULA_ANOLETIVO ON ZTIPOSAULA (ano_letivo DESC);

Both indexes are useful since the curso column from ZUCS and ano_letivo column from ZTIPOSAULA are used in the WHERE condition to retrieve a specific program from a specific year.


Query 3.1

Shortcut: To the top

Which courses did have occurrences planned but did not get service assigned in year 2003/2004?*[Using NOT IN]*

SQL Formulation

  1. SELECT DISTINCT UCS.codigo
  2. FROM XOCORRENCIAS OCCURENCES
  3. INNER JOIN XUCS UCS ON OCCURENCES.codigo = UCS.codigo
  4. WHERE OCCURENCES.ano_letivo = '2003/2004'
  5. AND UCS.codigo NOT IN (
  6. SELECT DISTINCT codigo
  7. FROM XTIPOSAULA CLASS_TYPES
  8. INNER JOIN XDSD T_DISTRIBUTION ON CLASS_TYPES.id = T_DISTRIBUTION.id
  9. WHERE CLASS_TYPES.ano_letivo = '2003/2004'
  10. )

There was also a try to have a materialized view for a portion of the query using the following DDL statement

  1. CREATE MATERIALIZED VIEW
  2. PLANNED_UCS AS
  3. SELECT DISTINCT codigo
  4. FROM XTIPOSAULA CLASS_TYPES
  5. INNER JOIN XDSD T_DISTRIBUTION ON CLASS_TYPES.id = T_DISTRIBUTION.id
  6. WHERE CLASS_TYPES.ano_letivo = '2003/2004';
  1. SELECT DISTINCT UCS.codigo
  2. FROM XOCORRENCIAS OCCURENCES
  3. INNER JOIN XUCS UCS ON OCCURENCES.codigo = UCS.codigo
  4. WHERE OCCURENCES.ano_letivo = '2003/2004'
  5. AND NOT EXISTS (
  6. SELECT codigo
  7. FROM PLANNED_UCS
  8. WHERE codigo = OCCURENCES.codigo
  9. )

Result (in JSON format)

  • 138 row
    1. {
    2. "results" : [
    3. {
    4. "columns" : [
    5. {
    6. "name" : "CODIGO",
    7. "type" : "VARCHAR2"
    8. }
    9. ],
    10. "items" : [
    11. {
    12. "codigo" : "MEMT1000"
    13. },
    14. {
    15. "codigo" : "MEMT100"
    16. },
    17. {
    18. "codigo" : "EQ418"
    19. },
    20. {
    21. "codigo" : "MTM108"
    22. },
    23. {
    24. "codigo" : "MEMT131"
    25. },
    26. {
    27. "codigo" : "MEEC1053"
    28. },
    29. {
    30. "codigo" : "MEM157"
    31. },
    32. {
    33. "codigo" : "MEM181"
    34. },
    35. {
    36. "codigo" : "MDI1205"
    37. },
    38. {
    39. "codigo" : "MPFCA103"
    40. },
    41. {
    42. "codigo" : "MPFCA204"
    43. },
    44. {
    45. "codigo" : "EIC4220"
    46. },
    47. {
    48. "codigo" : "EIC4221"
    49. },
    50. {
    51. "codigo" : "EIC4222"
    52. },
    53. {
    54. "codigo" : "CI027"
    55. },
    56. {
    57. "codigo" : "MEMT107"
    58. },
    59. {
    60. "codigo" : "MEMT102"
    61. },
    62. {
    63. "codigo" : "MEAM1310"
    64. },
    65. {
    66. "codigo" : "MPPAU2215"
    67. },
    68. {
    69. "codigo" : "MEM187"
    70. },
    71. {
    72. "codigo" : "MEM189"
    73. },
    74. {
    75. "codigo" : "MEA219"
    76. },
    77. {
    78. "codigo" : "EI1107"
    79. },
    80. {
    81. "codigo" : "MPFCA106"
    82. },
    83. {
    84. "codigo" : "EIC4225"
    85. },
    86. {
    87. "codigo" : "CI014"
    88. },
    89. {
    90. "codigo" : "CI018"
    91. },
    92. {
    93. "codigo" : "CI007"
    94. },
    95. {
    96. "codigo" : "CI017"
    97. },
    98. {
    99. "codigo" : "CI008"
    100. },
    101. {
    102. "codigo" : "MEA412"
    103. },
    104. {
    105. "codigo" : "MTM111"
    106. },
    107. {
    108. "codigo" : "MDI1105"
    109. },
    110. {
    111. "codigo" : "MDI1103"
    112. },
    113. {
    114. "codigo" : "MEMT2000"
    115. },
    116. {
    117. "codigo" : "MEAM1312"
    118. },
    119. {
    120. "codigo" : "MEMT135"
    121. },
    122. {
    123. "codigo" : "MPPAU1113"
    124. },
    125. {
    126. "codigo" : "EIC3209"
    127. },
    128. {
    129. "codigo" : "MEM179"
    130. },
    131. {
    132. "codigo" : "MEA215"
    133. },
    134. {
    135. "codigo" : "MEA414"
    136. },
    137. {
    138. "codigo" : "MDI1107"
    139. },
    140. {
    141. "codigo" : "MDI1208"
    142. },
    143. {
    144. "codigo" : "MDI1108"
    145. },
    146. {
    147. "codigo" : "MPPAU2217"
    148. },
    149. {
    150. "codigo" : "MPFCA101"
    151. },
    152. {
    153. "codigo" : "MPFCA205"
    154. },
    155. {
    156. "codigo" : "EIC5127"
    157. },
    158. {
    159. "codigo" : "MTM115"
    160. },
    161. {
    162. "codigo" : "EMM528"
    163. },
    164. {
    165. "codigo" : "MTM110"
    166. },
    167. {
    168. "codigo" : "MEAM5000"
    169. },
    170. {
    171. "codigo" : "EC5280"
    172. },
    173. {
    174. "codigo" : "MPFCA100"
    175. },
    176. {
    177. "codigo" : "MPFCA104"
    178. },
    179. {
    180. "codigo" : "MPFCA200"
    181. },
    182. {
    183. "codigo" : "EC5200"
    184. },
    185. {
    186. "codigo" : "EEC5022"
    187. },
    188. {
    189. "codigo" : "EIC5124"
    190. },
    191. {
    192. "codigo" : "CI020"
    193. },
    194. {
    195. "codigo" : "CI016"
    196. },
    197. {
    198. "codigo" : "CI011"
    199. },
    200. {
    201. "codigo" : "MTM114"
    202. },
    203. {
    204. "codigo" : "MPPAU1114"
    205. },
    206. {
    207. "codigo" : "MEM180"
    208. },
    209. {
    210. "codigo" : "MVC1211"
    211. },
    212. {
    213. "codigo" : "MEA112"
    214. },
    215. {
    216. "codigo" : "MEA217"
    217. },
    218. {
    219. "codigo" : "MEA320"
    220. },
    221. {
    222. "codigo" : "MEMT106"
    223. },
    224. {
    225. "codigo" : "EC5287"
    226. },
    227. {
    228. "codigo" : "MDI1106"
    229. },
    230. {
    231. "codigo" : "MPPAU2219"
    232. },
    233. {
    234. "codigo" : "MPFCA105"
    235. },
    236. {
    237. "codigo" : "MPFCA107"
    238. },
    239. {
    240. "codigo" : "MPFCA201"
    241. },
    242. {
    243. "codigo" : "MPFCA202"
    244. },
    245. {
    246. "codigo" : "MPFCA206"
    247. },
    248. {
    249. "codigo" : "EIC5125"
    250. },
    251. {
    252. "codigo" : "EIC5126"
    253. },
    254. {
    255. "codigo" : "CI038"
    256. },
    257. {
    258. "codigo" : "MEB205"
    259. },
    260. {
    261. "codigo" : "EQ407"
    262. },
    263. {
    264. "codigo" : "MDI1204"
    265. },
    266. {
    267. "codigo" : "MDI1100"
    268. },
    269. {
    270. "codigo" : "MFAMF1108"
    271. },
    272. {
    273. "codigo" : "MPPAU2220"
    274. },
    275. {
    276. "codigo" : "MPPAU2216"
    277. },
    278. {
    279. "codigo" : "MEM163"
    280. },
    281. {
    282. "codigo" : "MEM175"
    283. },
    284. {
    285. "codigo" : "MEM184"
    286. },
    287. {
    288. "codigo" : "MEM188"
    289. },
    290. {
    291. "codigo" : "MEM191"
    292. },
    293. {
    294. "codigo" : "MEA415"
    295. },
    296. {
    297. "codigo" : "EIC4223"
    298. },
    299. {
    300. "codigo" : "EIC5122"
    301. },
    302. {
    303. "codigo" : "EIC5123"
    304. },
    305. {
    306. "codigo" : "CI023"
    307. },
    308. {
    309. "codigo" : "CI009"
    310. },
    311. {
    312. "codigo" : "MEM1205"
    313. },
    314. {
    315. "codigo" : "GEI512"
    316. },
    317. {
    318. "codigo" : "MEMT105"
    319. },
    320. {
    321. "codigo" : "MTM104"
    322. },
    323. {
    324. "codigo" : "MEAM1314"
    325. },
    326. {
    327. "codigo" : "EQ411"
    328. },
    329. {
    330. "codigo" : "MDI1207"
    331. },
    332. {
    333. "codigo" : "MDI1209"
    334. },
    335. {
    336. "codigo" : "MEB204"
    337. },
    338. {
    339. "codigo" : "MMCCE1220"
    340. },
    341. {
    342. "codigo" : "EEC2207"
    343. },
    344. {
    345. "codigo" : "EIC4224"
    346. },
    347. {
    348. "codigo" : "EIC5129"
    349. },
    350. {
    351. "codigo" : "CI019"
    352. },
    353. {
    354. "codigo" : "CI002"
    355. },
    356. {
    357. "codigo" : "CI025"
    358. },
    359. {
    360. "codigo" : "CI037"
    361. },
    362. {
    363. "codigo" : "MEB105"
    364. },
    365. {
    366. "codigo" : "EQ308"
    367. },
    368. {
    369. "codigo" : "MPPAU2218"
    370. },
    371. {
    372. "codigo" : "MPPAU1112"
    373. },
    374. {
    375. "codigo" : "EEC5272"
    376. },
    377. {
    378. "codigo" : "MEM5000"
    379. },
    380. {
    381. "codigo" : "MEM158"
    382. },
    383. {
    384. "codigo" : "MEM182"
    385. },
    386. {
    387. "codigo" : "MEM183"
    388. },
    389. {
    390. "codigo" : "MEA216"
    391. },
    392. {
    393. "codigo" : "MEA319"
    394. },
    395. {
    396. "codigo" : "MEST210"
    397. },
    398. {
    399. "codigo" : "MEMT110"
    400. },
    401. {
    402. "codigo" : "MDI1206"
    403. },
    404. {
    405. "codigo" : "MEMT120"
    406. },
    407. {
    408. "codigo" : "MPPAU1115"
    409. },
    410. {
    411. "codigo" : "MPFCA102"
    412. },
    413. {
    414. "codigo" : "MPFCA203"
    415. },
    416. {
    417. "codigo" : "CI003"
    418. },
    419. {
    420. "codigo" : "CI004"
    421. },
    422. {
    423. "codigo" : "CI013"
    424. }
    425. ]
    426. }
    427. ]
    428. }

    Execution Plan

    | | Environment X | Environment Y | Environment Z |
    | ————- | ——————- | ——————- | ——————- |
    | Cost without Mat. View | 670 | 86 | 51 |
    | Cost with Mat. View | 610 | 31 | 31 |
Environment X

Without the materialized view:

With the materialized view:

Environment Y

Compared to environment X, primary and foreign key constraints had a great impact on the query performance. This can be seen in the query plan where it’s used a fast full scan twice in the query.

Without the materialized view:

With the materialized view:

Environment Z

This are the indexes used to improve the cost of the query. It was also used the constraints defined from the previous environment.

  1. CREATE INDEX IX_ZTIPOSAULA_ANOLETIVO ON ZTIPOSAULA (ano_letivo DESC);
  2. CREATE INDEX IX_ZDSD_ID ON ZDSD(id);

The first one is useful since the ano_letivo column from ZTIPOSAULA is used in the WHERE condition to determine the planned ucs. We could also opt to create the index on all attribute from the foreign key but that would produce the same effect.

The second one is also useful because the primary key on ZDSD is (nr, id) that creates a composite index on both attributes. The problem is that the composite index will work on statements that use only NR or both the nr and id columns. Since the id is used for the JOIN operation, that index also as an impact on the query cost.

Without the materialized view:

With the materialized view:


Query 3.2

Shortcut: To the top

Which courses did have occurrences planned but did not get service assigned in year 2003/2004? [Using external join and is null]

SQL Formulation

  1. SELECT DISTINCT UCS.codigo
  2. FROM XOCORRENCIAS OCCURENCES
  3. INNER JOIN XUCS UCS ON OCCURENCES.codigo = UCS.codigo
  4. LEFT OUTER JOIN (
  5. SELECT DISTINCT codigo
  6. FROM XTIPOSAULA CLASS_TYPES
  7. INNER JOIN XDSD T_DISTRIBUTION ON CLASS_TYPES.id = T_DISTRIBUTION.id
  8. WHERE CLASS_TYPES.ano_letivo = '2003/2004'
  9. ) PLANNED_UCS ON PLANNED_UCS.codigo = UCS.codigo
  10. WHERE PLANNED_UCS.CODIGO IS NULL
  11. AND ANO_LETIVO = '2003/2004'

Result (in JSON format)

  • The same result as in Query 3.1

    Execution Plan

    | | Environment X | Environment Y | Environment Z |
    | ————- | ——————- | ——————- | ——————- |
    | Cost | 671 | 87 | 52 |
Environment X

Environment Y

Compared to environment X, primary and foreign key constraints didn’t have any impact on the cost of the query since the access done was a full access.

Environment Z


Query 4

Shortcut: To the top

Who is the professor with more class hours for each type of class, in the academic year 2003/2004?
Show the number and name of the professor, the type of class and the total of class hours times the factor.

SQL Formulation

  1. SELECT nr, nome, tipo, max_hours
  2. FROM (
  3. SELECT MAX(nr) AS nr, tipo, MAX(total_hours) AS max_hours
  4. FROM (
  5. SELECT nr, tipo, SUM(horas * fator) AS total_hours
  6. FROM XTIPOSAULA
  7. NATURAL JOIN XDSD
  8. WHERE ano_letivo = '2003/2004'
  9. GROUP BY nr, tipo
  10. )
  11. GROUP BY TIPO
  12. ) TMP
  13. NATURAL JOIN XDOCENTES

Result (in JSON format)

  1. {
  2. "results" : [
  3. {
  4. "columns" : [
  5. {
  6. "name" : "NR",
  7. "type" : "NUMBER"
  8. },
  9. {
  10. "name" : "NOME",
  11. "type" : "VARCHAR2"
  12. },
  13. {
  14. "name" : "TIPO",
  15. "type" : "VARCHAR2"
  16. },
  17. {
  18. "name" : "MAX_HOURS",
  19. "type" : "NUMBER"
  20. }
  21. ],
  22. "items" : [
  23. {
  24. "nr" : 246626,
  25. "nome" : "Jorge Manuel Gomes Barbosa",
  26. "tipo" : "OT",
  27. "max_hours" : 3.5
  28. },
  29. {
  30. "nr" : 908100,
  31. "nome" : "Armínio de Almeida Teixeira",
  32. "tipo" : "P",
  33. "max_hours" : 30
  34. },
  35. {
  36. "nr" : 908290,
  37. "nome" : "José Manuel Miguez Araújo",
  38. "tipo" : "TP",
  39. "max_hours" : 26
  40. },
  41. {
  42. "nr" : 909330,
  43. "nome" : "Nuno Filipe da Cunha Nogueira",
  44. "tipo" : "T",
  45. "max_hours" : 30.67
  46. }
  47. ]
  48. }
  49. ]
  50. }

Execution Plan

Environment X Environment Y Environment Z
Cost 69 69 38
Environment X

Environment Y

Compared to environment X, primary and foreign key constraints didn’t have any impact on the cost of the query.

Environment Z

Only one index was used to improve the cost of the query. Additionally, the constraints defined from the previous environment were also used.

  1. CREATE INDEX IX_ZTIPOSAULA_ANOLETIVO ON ZTIPOSAULA (ano_letivo DESC);

This index is useful since the ano_letivo column from ZTIPOSAULA is used in the WHERE condition to retrieve a specific year.


Query 5

Shortcut: To the top

Compare the execution plans (just the environment Z) and the index sizes for the query giving the course code, the academic year, the period, and number of hours of the type ‘OT’ in the academic years of 2002/2003 and 2003/2004.

  • a) With a B-tree index on the type and academic year columns of the
    ZTIPOSAULA table;
  • b) With a bitmap index on the type and academic year columns of the
    ZTIPOSAULA table;

B-tree index

  • High cardinality
    1. CREATE INDEX IX_ZTIPOSAULA_TIPO ON ZTIPOSAULA (TIPO);
    2. CREATE INDEX IX_ZTIPOSAULA_ANOLETIVO ON ZTIPOSAULA (ANO_LETIVO DESC);

Bitmap index

  • Low cardinality
  1. CREATE BITMAP INDEX IX_ZTIPOSAULA_TIPO ON ZTIPOSAULA (TIPO);
  2. CREATE BITMAP INDEX IX_ZTIPOSAULA_ANOLETIVO ON ZTIPOSAULA (ANO_LETIVO DESC);

SQL Formulation

  1. SELECT codigo, ano_letivo, periodo, horas_turno * turnos as NumberOfHours
  2. FROM ZTIPOSAULA
  3. WHERE (ano_letivo = '2002/2003' OR ano_letivo = '2003/2004') AND (tipo = 'OT')

Result (in JSON format)

  1. {
  2. "results" : [
  3. {
  4. "columns" : [
  5. {
  6. "name" : "CODIGO",
  7. "type" : "VARCHAR2"
  8. },
  9. {
  10. "name" : "ANO_LETIVO",
  11. "type" : "VARCHAR2"
  12. },
  13. {
  14. "name" : "PERIODO",
  15. "type" : "VARCHAR2"
  16. },
  17. {
  18. "name" : "NUMBEROFHOURS",
  19. "type" : "NUMBER"
  20. }
  21. ],
  22. "items" : [
  23. {
  24. "codigo" : "EIC5202",
  25. "ano_letivo" : "2002/2003",
  26. "periodo" : "2S",
  27. "numberofhours" : 27
  28. },
  29. {
  30. "codigo" : "EIC5202",
  31. "ano_letivo" : "2003/2004",
  32. "periodo" : "2S",
  33. "numberofhours" : 24
  34. }
  35. ]
  36. }
  37. ]
  38. }

Execution Plan

Environment Z a) Environment Z b)
Cost 5 8
Environment Z a)

In Oracle, full table scans have less cost than index range scans in cases when accessing a large fraction of the blocks in a table. The reason is that full table scans can use larger I/O calls what is cheaper than making many smaller calls.

But, Oracle optimizer uses a range scan when it finds one or more leading columns of an index specified in condition as it is in this task.

Environment Z b)

The way of accessing the attributes is similar to the one on the execution with the B-tree index, using an index access and a table access by index rowid, however, the query has a higher cost when executed with a bitmap index.

Index sizes

  • Query to get index sizes:
    1. SELECT index_name, index_type, table_name, uniqueness, blevel, leaf_blocks, distinct_keys, num_rows
    2. FROM user_indexes;

Result (in JSON format) - B-tree indexes

  1. {
  2. "results" : [
  3. {
  4. "columns" : [
  5. {
  6. "name" : "INDEX_NAME",
  7. "type" : "VARCHAR2"
  8. },
  9. {
  10. "name" : "INDEX_TYPE",
  11. "type" : "VARCHAR2"
  12. },
  13. {
  14. "name" : "TABLE_NAME",
  15. "type" : "VARCHAR2"
  16. },
  17. {
  18. "name" : "UNIQUENESS",
  19. "type" : "VARCHAR2"
  20. },
  21. {
  22. "name" : "BLEVEL",
  23. "type" : "NUMBER"
  24. },
  25. {
  26. "name" : "LEAF_BLOCKS",
  27. "type" : "NUMBER"
  28. },
  29. {
  30. "name" : "DISTINCT_KEYS",
  31. "type" : "NUMBER"
  32. },
  33. {
  34. "name" : "NUM_ROWS",
  35. "type" : "NUMBER"
  36. }
  37. ],
  38. "items" : [
  39. {
  40. "index_name" : "IX_ZTIPOSAULA_TIPO",
  41. "index_type" : "NORMAL",
  42. "table_name" : "ZTIPOSAULA",
  43. "uniqueness" : "NONUNIQUE",
  44. "blevel" : 1,
  45. "leaf_blocks" : 39,
  46. "distinct_keys" : 5,
  47. "num_rows" : 21019
  48. },
  49. {
  50. "index_name" : "IX_ZTIPOSAULA_ANOLETIVO",
  51. "index_type" : "FUNCTION-BASED NORMAL",
  52. "table_name" : "ZTIPOSAULA",
  53. "uniqueness" : "NONUNIQUE",
  54. "blevel" : 1,
  55. "leaf_blocks" : 65,
  56. "distinct_keys" : 19,
  57. "num_rows" : 21019
  58. }
  59. ]
  60. }
  61. ]
  62. }

Result (in JSON format) - Bitmap indexes

Under is showed part of JSON regarding indexes from task 5.

  1. {
  2. "results" : [
  3. {
  4. "columns" : [
  5. {
  6. "name" : "INDEX_NAME",
  7. "type" : "VARCHAR2"
  8. },
  9. {
  10. "name" : "INDEX_TYPE",
  11. "type" : "VARCHAR2"
  12. },
  13. {
  14. "name" : "TABLE_NAME",
  15. "type" : "VARCHAR2"
  16. },
  17. {
  18. "name" : "UNIQUENESS",
  19. "type" : "VARCHAR2"
  20. },
  21. {
  22. "name" : "BLEVEL",
  23. "type" : "NUMBER"
  24. },
  25. {
  26. "name" : "LEAF_BLOCKS",
  27. "type" : "NUMBER"
  28. },
  29. {
  30. "name" : "DISTINCT_KEYS",
  31. "type" : "NUMBER"
  32. },
  33. {
  34. "name" : "NUM_ROWS",
  35. "type" : "NUMBER"
  36. }
  37. ],
  38. "items" : [
  39. {
  40. "index_name" : "IX_ZTIPOSAULA_ANOLETIVO",
  41. "index_type" : "BITMAP",
  42. "table_name" : "ZTIPOSAULA",
  43. "uniqueness" : "NONUNIQUE",
  44. "blevel" : 0,
  45. "leaf_blocks" : 1,
  46. "distinct_keys" : 19,
  47. "num_rows" : 19
  48. },
  49. {
  50. "index_name" : "IX_ZTIPOSAULA_TIPO",
  51. "index_type" : "BITMAP",
  52. "table_name" : "ZTIPOSAULA",
  53. "uniqueness" : "NONUNIQUE",
  54. "blevel" : 1,
  55. "leaf_blocks" : 2,
  56. "distinct_keys" : 5,
  57. "num_rows" : 5
  58. }
  59. ]
  60. }
  61. ]
  62. }

Two columns that are important to look at are: “leaf_blocks” as the number of blocks used by the index at the lowest and largest level, and “num_rows” as the number of rows that are indexed.

Looking at the given JSON results above for both indexes we can compare “leaf_blocks” values. If the “leaf_blocks” values is lager that means index consumes more space. It is visible that bitmap indexes need drastically less space than their B-tree equivalents. Also, with B-tree index we have one entry in the “leaf_blocks” for each row in the table but with Bitmap index no.

If we take a look at the “num_rows” we can see that bitmap indexes also have significantly smaller value than b-tree indexes. The reason is because bitmap indexes store a single key value that points to many rows and because of that, there will be significantly less key values (num_rows) in a bitmap index than in the table it points to.


Query 6

Shortcut: To the top

Select the programs (curso) that have classes with all the existing types.

SQL Formulation

  1. SELECT DISTINCT UCS.curso AS programm
  2. FROM XUCS UCS JOIN XTIPOSAULA CLASS_TYPES ON UCS.codigo = CLASS_TYPES.codigo
  3. GROUP BY UCS.curso
  4. HAVING COUNT(DISTINCT CLASS_TYPES.tipo) = (
  5. SELECT COUNT(DISTINCT tipo)
  6. FROM XTIPOSAULA
  7. )

Result (in JSON format)

  1. {
  2. "results" : [
  3. {
  4. "columns" : [
  5. {
  6. "name" : "PROGRAMM",
  7. "type" : "NUMBER"
  8. }
  9. ],
  10. "items" : [
  11. {
  12. "programm" : 9461
  13. },
  14. {
  15. "programm" : 4495
  16. },
  17. {
  18. "programm" : 9508
  19. },
  20. {
  21. "programm" : 2021
  22. }
  23. ]
  24. }
  25. ]
  26. }

Execution Plan

Environment X Environment Y Environment Z
Cost 51 51 45
Environment X

Environment Y

In this query we can see that using primary and foreign keys didn’t change the cost of the query, since the optimizer decided to do a full access to all tables, instead of a fast full access.

Environment Z

This are the indexes used to improve the cost of the query. It was also used the constraints defined from the previous environment.

  1. CREATE INDEX IX_ZTIPOSAULA_TIPO ON ZTIPOSAULA (tipo);
  2. CREATE INDEX IX_ZUCS_CODIGO_CURSO ON ZUCS(codigo, curso);

The first one is useful since the tipo column from ZTIPOSAULA is used in the HAVING clause to determine if a course has all types of classes.
This makes the HAVING statement have a cost of 13 instead of 37. In spite of this, the total cost of the query won’t change with this index.

The second one is more useful because it reduces the cost of the hash join between ZUCS and ZTIPOSAULA. It uses a composite index with the columns codigo and curso.


Conclusion

Shortcut: To the top

We learned that the first thing to do when trying to optimize queries is to add the primary and foreign keys constraints since most queries this is where the cost was substantially lower. We also learned about query optimization, where different formulations can lead to very distinct results, such as doing unnecessary joins, complex operations, etc.

Moreover, the B-tree indexes reduced even further the cost of most queries. Our motto for adding them was to first add on columns used in joins. These columns most commonly refer to foreign keys, where, in some environments, such as Oracle SQL Developer, this is not done automatically. Additionally, we also added indexes to columns used in WHERE conditions.

Apart from single-column indexes, we also used composite indexes in more than one query to reduce the overall number of indexes (and their size).

Although we explored Bitmap indexes, they were not used since there isn’t any column that needs an index while having low cardinality, i.e., where columns have lots of duplicate values.

Finally, we also tried to experiment with MATERIALIZED VIEWS (reference to Query 3.1) that gave us better results but wasn’t explored in too much detail due to time constraints.

In conclusion, we learned a lot about query optimization in SQL and indexes during this project. In future work, we can further improve the results and keep learning, as this process could take some time before reaching an optimal solution for all queries.