In this project you will assist a pet rescue charity with managing their donations related data. The charity organizes an annual donation drive. The city is divided to donations areas (every donation area is approximately 6 postal codes) and assign a group of volunteers to every area. The volunteers will go door to door in their designated areas in order to collect donations from the residents. They can collect cash, checks, or credit card payments. At the end of every week the volunteering group leaders will fill in a list with the donation record that were collected in their area, and send it to the charity main office. The staff in the main office will load the list to a central table after rejecting non-valid entries and then use the data to perform analysis.
Central Donations Repository
The tables created by projectTables.sql script reside in the charity’s Oracle database server as well as the list of the volunteers.
The server is on host named db8.fast.sheridanc.on.ca. Each group is assigned their own database. The credentials to use are:
Server Name: db8.fast.sheridanc.on.ca Port: 22 – SSH, 1521 – Oracle Linux username: userxx Password: Or4cl3S2020 Oracle instance name: Uxx.world Each instance has the following usernames/passwords: • sys/[same as ssh] • system/[same as ssh] • prc/PetRescue • vl1/PetRescue • vl2/PetRescue • vl3/PetRescue • vl4/PetRescue • dml/DataMart
Where the linux username and Oracle instance contain xx – replace it with your group #. Users SYS and SYSTEM are for database administration. PRC is to store the addresses and donors. VL1-VL4 are the accounts for the volunteer leaders. DML is to be used to store that datamart
The list of donors is stored in a comma separated file. A sample named donorsList.csv shows a list that the volunteer group lead sends to the main office. It illustrates the schema and contains only two entries. DONOT USE EITHER OF THOSE ENTRIES IN YOUR SUBMISSION. Each member of the group (presumably a volunteer with a distinct volunteer no) should provide a separate list for their “area” with at least 15 entries. As a result each group will have three (or four) donor lists that contain a minimum of 15 additional entries in the donors list. Make sure that there are both valid and invalid entries. The invalid entries will be rejected and sent back to the respective volunteer coordinator. In total there must be at LEAST 60 distinct names/addresses on at LEAST 3 separate lists.
Master Addresses Table
The address table that the charity maintains is not updated and it often gets out of date. However, there is another department in the organization has an address table that they regularly update and keep current. The table is in SQLServer database and below is the connection information to the server
Server Name: dbr.fast.sheridanc.on.ca Port: 1433 Database name: Integration Schema Name: dbo User Name: DataIntegrator Password: Sher1dan
- Refresh the address table in the group’s Oracle database with the addresses in the master table. You will need to transform some data types and generate a sequential id for the address. Note that the:
- ids of some addresses will be used as foreign key in donation table so your solution should accommodate this fact.
- the master table does not contain postal codes so your solution should accommodate this fact
- Create a process that loads the donations list to the central donations table: In this task you will load each donor file into the central repository (Oracle tables). You need to make sure that only the donations with valid addresses are inserted into the table. Donation records with erroneous addresses must be rejected. Also make sure to reject the donations that have nulls for the mandatory columns in your database (e.g. NOT NULL columns). You must generate a csv file corresponding to the volunteer group leaders with the records that were rejected in their area.
- Create a star schema for the donations. The grain of the schema should be the combination of day, address, and volunteer containing the sum and average amounts
- Create a process to load the data to the star schema from the central donation repository
- Create views that shows
- The average and sum of the donation by day, month, year
- The average and sum of the donations by address, postal code
- The average and sum of the donations by volunteer and volunteer group leader
- Basic Security
- Create a user named DMLUser and give the user permissions to implement all DML on address, donation, and volunteer tables
- Create a user named Dashboard and give the user read permissions on the views
NOTE: All tasks (1-6) are to be performed using your groups Oracle database.
Proposal (Group Work)
In the past few weeks you learned advanced SQL statements, PL/SQL and Talend Data Integrator. You can use a combination of the three tools to accomplish tasks 1,2, and 4. In order to start the project you need to research these technologies and decide what combination of technologies you will use for every task. You are required to write a one or two pages proposal that must include the below information:
- Who is responsible for each task? The work should be assigned in a balanced manner between the group members.
- Proposed due date for every task.
- Dependencies between the tasks.
- How do you plan to complete tasks 1,2 and 4. in this part you need to describe the process that you intend to follow in order to compete the tasks.
You will be provided with a template for the proposal. The proposal will be submitted by the due date on the Assignment on SLATE.
Report (Group Work)
- Create a Microsoft word report documenting the processes. The report must include 6 sections. Each of the section will be describing one of the 6 tasks. You must describe the processes in details. Use embedded screenshot and text descriptions in the document to illustrate the ETL processes. The screenshots must be proportional to the page and yet must be clear. Restrict the screen shots only to the areas that are relevant to the description.
- An export for your TOS project as a zipped file. Use the “Export Projects” button on the menu bar to make an Archive File to export resources to an archive file on your local file system. Make the file name gggggg_ssssss.zip where gggggg is your group (e.g. group1) and ssssss is the section number of this course. Include the project component and each of its routines (e.g. GROUPPROJECT and GROUPPROJECT_ROUTINES)
- A zip file containing all the donation lists files (.csv) and their companion reject lists (also .csv)
- A zip file containing all SQL and/or PL/SQL code used for the Oracle servers
You must write a one or two pages summary that describes your design for the project in your own words and details your involvement in the project. The summary must have the below 4 sections:
- A brief description of your group’s solution to the project, what are the differences between your final solution and the proposal if any, and why did not you partially follow your proposal
- Your contribution to the project. You will get the full grade on this item if your contribution is proportionate to the other group members and if it is well described. You will get a bonus grade if your contribution is greater that the other members in the group
- State the difficulties that you faced in the project
- Suggest improvements to the pet rescue operation data collection process. The suggestions must be related to the scope of this project.
The professor can ask to interview the group if some of the details are not clear about the project implementation. The result of the interview will be considered in the grade