# Assignment 3

Hey there! Welcome to the Knowledge Lens Intern Training Program.

This assignment serves as a quick refresher on using NoSQL and time-series databases.
There are three tasks in this assignment. On completing them, you'll learn how to:
*  Interact with MongoDB
*  Use Pandas DataFrames to generate your own Excel reports
*  Use the KairosDB time-series database to ingest and query data
*  Publish and consume messages via the MQTT protocol
*  Cache data using Redis

Happy Coding! :tada:

## :pushpin: Task 1: Working with Mongo - Advanced


### :golf: Areas covered:
- Working with NoSQL
- Working with Pandas

### :books: Description:


You are given a dataset of restaurant reviews in the form of a JSON file. The end goal of the project is to create an API that provides the following:
1. The name of the business with the highest average review score.
2. The cuisine with the highest number of restaurants.
3. An Excel report based on cuisine, name, and borough.

Sample Document:
```json
{
  "address": {
    "building": "120",
    "coord": [
      -73.9998042,
      40.7251256
    ],
    "street": "Prince Street",
    "zipcode": "10012"
  },
  "borough": "Manhattan",
  "cuisine": "Bakery",
  "grades": [
    {
      "date": {
        "$date": "2014-10-17T00:00:00.000Z"
      },
      "grade": "A",
      "score": 11
    },
    {
      "date": {
        "$date": "2013-09-18T00:00:00.000Z"
      },
      "grade": "A",
      "score": 13
    },
    {
      "date": {
        "$date": "2013-04-30T00:00:00.000Z"
      },
      "grade": "A",
      "score": 7
    },
    {
      "date": {
        "$date": "2012-04-20T00:00:00.000Z"
      },
      "grade": "A",
      "score": 7
    },
    {
      "date": {
        "$date": "2011-12-19T00:00:00.000Z"
      },
      "grade": "A",
      "score": 3
    }
  ],
  "name": "Olive'S",
  "restaurant_id": "40363151"
}
```
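For the Excel report, one possible starting point is to project each document down to the three report fields and hand the result to Pandas. This is only a sketch: the function names, the default file name `restaurant_report.xlsx`, and the sheet name are assumptions made for illustration, and `to_excel` needs an Excel engine such as `openpyxl` installed.

```python
import pandas as pd

# Columns requested for the report, in output order.
REPORT_COLUMNS = ["cuisine", "name", "borough"]


def build_report_frame(documents):
    """Keep only the report columns from restaurant documents and sort the rows."""
    rows = [{column: doc.get(column) for column in REPORT_COLUMNS} for doc in documents]
    frame = pd.DataFrame(rows, columns=REPORT_COLUMNS)
    return frame.sort_values(REPORT_COLUMNS).reset_index(drop=True)


def write_report(documents, path="restaurant_report.xlsx"):
    """Write the report to an .xlsx file (hypothetical default path)."""
    build_report_frame(documents).to_excel(path, index=False, sheet_name="Restaurants")
    return path


# A document shaped like the sample above (other fields omitted for brevity).
sample = [{"name": "Olive'S", "cuisine": "Bakery", "borough": "Manhattan"}]
print(build_report_frame(sample))
```

`documents` can be any iterable of dictionaries, so the cursor returned by a PyMongo `find()` call could be passed in directly.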
Bonus points: Use the MongoDB aggregation framework.
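As a rough illustration of the aggregation approach, the two lookups could be expressed as pipelines like the ones below. This sketch assumes "highest average review" means the highest mean of `grades.score`, which the task does not state explicitly. The helper names and the `restaurants` / `avg_score` output fields are made up for this example.

```python
def top_cuisine_pipeline():
    """Pipeline: count restaurants per cuisine and keep the largest group."""
    return [
        {"$group": {"_id": "$cuisine", "restaurants": {"$sum": 1}}},
        {"$sort": {"restaurants": -1}},
        {"$limit": 1},
    ]


def top_rated_pipeline():
    """Pipeline: average grades.score per restaurant and keep the highest.

    Assumption: a higher mean score counts as a better review.
    """
    return [
        {"$unwind": "$grades"},  # one document per grade entry
        {
            "$group": {
                "_id": "$restaurant_id",
                "name": {"$first": "$name"},
                "avg_score": {"$avg": "$grades.score"},
            }
        },
        {"$sort": {"avg_score": -1}},
        {"$limit": 1},
    ]


def first_result(collection, pipeline):
    """Run a pipeline on a PyMongo-style collection; return the first row or None."""
    return next(iter(collection.aggregate(pipeline)), None)


# Usage with PyMongo (the URI, database, and collection names are hypothetical):
#   from pymongo import MongoClient
#   collection = MongoClient("mongodb://localhost:27017")["training"]["restaurants"]
#   print(first_result(collection, top_cuisine_pipeline()))
```

Keeping the pipelines as plain lists makes them easy to paste into Compass or Studio3T for inspection before wiring them into an API route.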

### :wrench: Tools to use: 
1. PyCharm / VS Code
2. Robo3T / Studio3T / MongoDB Compass
3. PyMongo


### :mag: References:
* [Querying Documents on Mongo](https://www.mongodb.com/docs/manual/tutorial/query-documents/)
* [Quick Summary on Mongo Aggregation Stages](https://www.mongodb.com/docs/manual/reference/operator/aggregation-pipeline/)
* [Generating Excel Sheets from a Pandas Dataframe](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html)
* [How to return files on FastAPI response](https://fastapi.tiangolo.com/advanced/custom-response/#fileresponse)
* [PyMongo Official Documentation](https://pymongo.readthedocs.io/en/stable/)



_________________________________

## :pushpin: Task 2: Working with Time-series


### :golf: Areas covered:
- Time-series operations
- Working with time-series data
- Working with Pandas

### :books: Description:

You are given a weather dataset in the form of a CSV file. The end goal of the project is to create an API that provides the following:

1. Daily, weekly, and monthly aggregates (min, max, and average) of the data, delivered as a report in Excel format.

Sample Document:

|Datetime        |AEP_MW|
|----------------|------|
|31/12/2004 01:00|13478 |
|31/12/2004 02:00|12865 |


### :wrench: Tools to use: 
1. PyCharm / VS Code
2. Pandas
3. KairosDB

### :mag: References:
* [How to query Kairos DB using Metrics](https://kairosdb.github.io/docs/restapi/QueryMetrics.html)
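
As a companion to the query reference above, the request bodies for ingesting and querying could be assembled roughly as follows. This is a sketch under several assumptions: timestamps are treated as UTC, and the metric name `aep_mw` and the `source` tag are invented for illustration. KairosDB expects timestamps in epoch milliseconds and at least one tag per metric.

```python
from datetime import datetime, timezone


def to_epoch_ms(text):
    """Convert 'dd/mm/YYYY HH:MM' (assumed to be UTC) to epoch milliseconds."""
    parsed = datetime.strptime(text, "%d/%m/%Y %H:%M").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


def build_ingest_payload(rows, metric="aep_mw", tags=None):
    """Body for POST /api/v1/datapoints; metric name and tag are hypothetical."""
    return [
        {
            "name": metric,
            "datapoints": [[to_epoch_ms(stamp), value] for stamp, value in rows],
            "tags": tags or {"source": "training"},
        }
    ]


def build_query_payload(metric, start_ms, end_ms, aggregator="avg", unit="days"):
    """Body for POST /api/v1/datapoints/query with one aggregator per sampling unit."""
    return {
        "start_absolute": start_ms,
        "end_absolute": end_ms,
        "metrics": [
            {
                "name": metric,
                "aggregators": [
                    {"name": aggregator, "sampling": {"value": 1, "unit": unit}}
                ],
            }
        ],
    }


rows = [("31/12/2004 01:00", 13478), ("31/12/2004 02:00", 12865)]
print(build_ingest_payload(rows))
```

Either body can be sent as JSON with any HTTP client. The host and port depend on your KairosDB installation.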


------------------------------------------------------

## :pushpin: Task 3: Working with MQTT & Redis

### :golf: Areas covered:
- MQTT Protocol
- Caching using Redis DB

### :books: Description

Data from different sites will be pushed via MQTT at a frequency of 10 seconds for the parameters PM10, PM2.5, SO2, and NO2.

Data can be of different quality: Good (0), Maintenance (1), or Error (2).

Based on the quality of the data, store each reading in a different Redis database.

Sample data format: 
```json
{
  "data": { "PM10": 100, "PM2.5": 23, "SO2": 21, "NO2": 32 },
  "site_id": "site_100",
  "data_quality": 1
}
```
	   
Use Redis for caching/storing information.

Create consumers that consume data from these topics and store it in a Redis database based on data quality.
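
The routing step of such a consumer could be sketched as below. The mapping of quality codes to Redis database indexes 0, 1, and 2, the use of `site_id` as the key, and the helper names are all assumptions for illustration. The task leaves these choices to you.

```python
import json

# Assumed mapping: data_quality code -> Redis logical database index.
QUALITY_TO_DB = {0: 0, 1: 1, 2: 2}


def parse_reading(payload):
    """Decode one MQTT payload into (site_id, quality, data)."""
    message = json.loads(payload)
    quality = int(message["data_quality"])
    if quality not in QUALITY_TO_DB:
        raise ValueError(f"unknown data_quality: {quality}")
    return message["site_id"], quality, message["data"]


def make_on_message(stores):
    """Build a paho-mqtt style on_message callback.

    `stores` maps each quality code to an object with a Redis-like
    set(key, value) method, for example redis.Redis(db=QUALITY_TO_DB[code]).
    """

    def on_message(client, userdata, msg):
        try:
            site_id, quality, data = parse_reading(msg.payload)
        except (ValueError, KeyError, TypeError):
            return  # skip malformed messages instead of crashing the consumer
        # Keep the latest reading per site, serialized as JSON.
        stores[quality].set(site_id, json.dumps(data))

    return on_message
```

To use it, assign the returned callback to a `paho-mqtt` client's `on_message` attribute and subscribe to the relevant topics. Note that `paho-mqtt` 2.x expects a callback API version argument when constructing the client, so check the version you have installed.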

### :wrench: Tools to use: 
1. PyCharm / VS Code
2. MQTT (pip package: `paho-mqtt`)
3. Redis (pip package: `redis`)

### :mag: References:
* [Using MQTT in Python](https://www.emqx.com/en/blog/how-to-use-mqtt-in-python)
* [Connecting to Redis in Python](https://docs.redis.com/latest/rs/references/client_references/client_python/)



