Commit 8e94ab3c authored by noureen.taj

Update assingment_4.md

parent ee44dad6
# Assignment 4 # Assignment 4
Hey there! Welcome to the Knowledge Lens Intern Training Program.
This assignment will serve as a quick refresher on the usage of NoSQL and time-series databases.
There are three tasks in this assignment, on completion of which you'll learn:
* How to interact with MongoDB
* How to use Pandas DataFrames to generate your own Excel reports
* How to use the Kairos time-series database for data ingestion and querying
* How to publish and consume messages via the MQTT protocol
* How to cache data using Redis
## Task 1: Working with Mongo - Advanced Happy Coding! :tada:
## :pushpin: Task 1: Working with Mongo - Advanced
## Areas covered:
### :golf: Areas covered:
- Working with NoSQL - Working with NoSQL
- Working with Pandas - Working with Pandas
## Description: ### :books: Description:
You are given with a semester deatils in a JSON format, write FAST APIs for the below : You are given semester details in JSON format. Write FastAPI endpoints for the following:
1. To accept the semester details JSON and insert it in a collection 1. To accept the semester details JSON and insert it in a collection
2. To get sum and average of all marks filtered by any of "student_id", "batch_id", "semster_id", "subject_id" 2. To get sum and average of all marks filtered by any of "student_id", "batch_id", "semster_id", "subject_id"
...@@ -80,32 +92,35 @@ Sample Document: ...@@ -80,32 +92,35 @@ Sample Document:
} }
``` ```
Note: Perform all filter operations on Mongo itself Note: Perform all filter operations on Mongo.
Bonus Points: Use the Mongo aggregation framework.
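
The sum-and-average requirement, together with the note about filtering on Mongo, could be approached roughly as follows. This is a minimal sketch, not a reference solution: the sample document is not fully visible here, so the field name `marks` and the assumption that marks sit at the top level of each document are hypothetical, as is the helper name `build_marks_pipeline`. The filter keys are spelled exactly as in the task text.

```python
# Filter keys named in the task ("semster_id" is spelled as in the task text).
ALLOWED_FILTERS = {"student_id", "batch_id", "semster_id", "subject_id"}


def build_marks_pipeline(filters: dict, marks_field: str = "marks") -> list:
    """Build an aggregation pipeline that filters, sums and averages on the server.

    `marks_field` is an assumed field name; adjust it to the real document shape.
    """
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"Unsupported filter(s): {sorted(unknown)}")
    return [
        # $match runs inside MongoDB, so no filtering happens in Python.
        {"$match": dict(filters)},
        # _id=None groups every matched document into a single result.
        {
            "$group": {
                "_id": None,
                "total": {"$sum": f"${marks_field}"},
                "average": {"$avg": f"${marks_field}"},
            }
        },
        # Drop the synthetic _id from the output.
        {"$project": {"_id": 0, "total": 1, "average": 1}},
    ]
```

With PyMongo, the resulting list can be passed to `collection.aggregate(...)`, for example `list(collection.aggregate(build_marks_pipeline({"student_id": "S001"})))`, where `"S001"` is a made-up value. If marks are nested inside an array in the real documents, a `$unwind` stage would likely be needed before `$group`.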
### Tools to use: ### :wrench: Tools to use:
1. Pycharm / VSCode 1. Pycharm / VSCode
2. Robo3T / Studio3T / MongoDB Compass 2. Robo3T / Studio3T / MongoDB Compass
3. PyMongo 3. PyMongo
### Reference: ### :mag: References:
https://www.mongodb.com/docs/manual/tutorial/query-documents/ * [Querying Documents on Mongo](https://www.mongodb.com/docs/manual/tutorial/query-documents/)
https://www.mongodb.com/docs/manual/reference/operator/aggregation-pipeline/ * [Quick Summary on Mongo Aggregation Stages](https://www.mongodb.com/docs/manual/reference/operator/aggregation-pipeline/)
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html * [Generating Excel Sheets from a Pandas Dataframe](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html)
https://fastapi.tiangolo.com/advanced/custom-response/#fileresponse * [How to return files on FastAPI response](https://fastapi.tiangolo.com/advanced/custom-response/#fileresponse)
https://pymongo.readthedocs.io/en/stable/ * [PyMongo Official Documentation](https://pymongo.readthedocs.io/en/stable/)
________________________________________________________ _________________________________
## Task 2: Working with Timeseries
## :pushpin: Task 2: Working with Time-series
## Areas covered:
### :golf: Areas covered:
- Timeseries Operation - Timeseries Operation
- Working with Timeseries - Working with Timeseries
- Working with Pandas - Working with Pandas
## Description: ### :books: Description:
You are given a temperature dataset in the form of a CSV file. The end goal of the project is to create an API interface that will provide the following: You are given a temperature dataset in the form of a CSV file. The end goal of the project is to create an API interface that will provide the following:
1. Get daily, weekly and monthly aggregates (min, max, and average) of the data, filtered by "good" data points, and generate a report in Excel format. 1. Get daily, weekly and monthly aggregates (min, max, and average) of the data, filtered by "good" data points, and generate a report in Excel format.
...@@ -120,26 +135,40 @@ Sample Document: ...@@ -120,26 +135,40 @@ Sample Document:
|2022-04-03 00:02:00.000 | 111.32| good | |2022-04-03 00:02:00.000 | 111.32| good |
|2022-04-03 00:03:00.000 | 114.98| bad | |2022-04-03 00:03:00.000 | 114.98| bad |
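
The daily, weekly and monthly aggregation over "good" points can be sketched with Pandas resampling. The table header is not visible in this view, so the column names `timestamp`, `value` and `quality` are assumptions, as are the helper names. The sketch works on an in-memory DataFrame and leaves the Kairos ingestion and query steps out.

```python
import pandas as pd

# Resample rules: "D" = calendar day, "W" = week ending on Sunday,
# "MS" = calendar month, labelled by its first day.
RULES = {"daily": "D", "weekly": "W", "monthly": "MS"}


def aggregate_good(df, ts_col="timestamp", value_col="value", quality_col="quality"):
    """Return min/max/mean per period, using only rows marked "good".

    Column names are assumed; pass the real ones from the CSV header.
    """
    good = df[df[quality_col] == "good"].copy()
    good[ts_col] = pd.to_datetime(good[ts_col])
    series = good.set_index(ts_col)[value_col]
    return {
        name: series.resample(rule).agg(["min", "max", "mean"])
        for name, rule in RULES.items()
    }


def write_report(reports, path="report.xlsx"):
    """Write one sheet per aggregation level (needs an Excel engine such as openpyxl)."""
    with pd.ExcelWriter(path) as writer:
        for name, frame in reports.items():
            frame.to_excel(writer, sheet_name=name)
```

A typical call would be `write_report(aggregate_good(pd.read_csv("temperature.csv")))`, where the file name is hypothetical. The resulting file can then be returned from a FastAPI endpoint with `FileResponse`.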
### Tools to use: ### :wrench: Tools to use:
1. Pycharm / VSCode 1. Pycharm / VSCode
2. Pandas 2. Pandas
3. Kairosdb 3. Kairos
### :mag: References:
* [How to query Kairos DB using Metrics](https://kairosdb.github.io/docs/restapi/QueryMetrics.html)
### Reference: ------------------------------------------------------
https://kairosdb.github.io/docs/restapi/QueryMetrics.html
https://pypi.org/project/kairosdb-python/
________________________________________________________ ## :pushpin: Task 3: Working with MQTT & REDIS
## Task 3: Working with MQTT
### :golf: Areas covered:
- MQTT Protocol
- Caching using Redis DB
### :books: Description
For the sample given in Task 2:
1. Publish a message over MQTT for every 5th successful record with "good" data quality.
2. The message should contain stats (sum, average, and the timestamp of the latest message) for those 5 good records.
3. Store each aggregation in a separate Redis database.
4. Develop an API to fetch the saved data from Redis.
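
The batching logic in the steps above can be sketched independently of any broker. The class name `GoodRecordBatcher`, the topic string and the message keys are all hypothetical. The `publish` argument is any callable taking a topic and a payload, so the sketch can be tried without a running MQTT broker or Redis server.

```python
import json
from typing import Callable, Optional


class GoodRecordBatcher:
    """Collect "good" records and emit one stats message per `batch_size` of them."""

    def __init__(
        self,
        publish: Callable[[str, str], object],
        topic: str = "sensors/temperature/stats",  # assumed topic name
        batch_size: int = 5,
    ):
        self._publish = publish
        self._topic = topic
        self._batch_size = batch_size
        self._values = []

    def add(self, timestamp: str, value: float, quality: str) -> Optional[dict]:
        """Feed one record; return the stats dict when a batch completes, else None."""
        if quality != "good":
            return None  # records with other quality values do not count
        self._values.append(value)
        if len(self._values) < self._batch_size:
            return None
        total = sum(self._values)
        stats = {
            "sum": total,
            "average": total / len(self._values),
            "timestamp": timestamp,  # timestamp of the latest good record
        }
        self._values.clear()
        self._publish(self._topic, json.dumps(stats))
        return stats
```

With `paho-mqtt`, a connected client's `publish` method accepts a topic and a payload, so it could be passed in as `publish`. One possible reading of "a separate Redis database" is to hold one client per numbered database, such as `redis.Redis(db=1)`, and `set` each JSON payload under a key of your choosing. That mapping is an assumption, not something the task specifies.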
### :wrench: Tools to use:
1. Pycharm / VSCode
2. MQTT - (PIP package: `paho-mqtt`)
3. REDIS - (PIP package: `redis`)
### :mag: References:
* [Using MQTT in Python](https://www.emqx.com/en/blog/how-to-use-mqtt-in-python)
* [Connection to Redis in Python](https://docs.redis.com/latest/rs/references/client_references/client_python/)
## Areas covered:
- MQTT operation
## Description:
For the above (task 2) given sample
1. Push a message to the MQTT for every 5th successful record with "good"
2. Message contains stats (sum, average, timestamp - latest message) of all above 5 good records
3. store each aggregation to a separate redis database
4. develop an API to fetch the above saved data from redis