Archive

Posts Tagged ‘AWS/boto3/json’

Script to generate CSV for Compute Optimizer data from a Json file

Below is the script to generate a CSV file from a JSON output. I wrote this script for generating CSV for collecting compute optimizer data so that each EC2 has one line of data in the CSV file. Later on this CSV file is uploaded to google sheet for further analysis.

Python script “reportComputeOptData.py” is called within shell script “reportComputeOptData.sh”.

Python Script

import sys
import json
import pandas as pd
## Env is set for proper console display
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
## Env Setting - Ends
jsonfile = str(sys.argv[1])
csvfile = str(sys.argv[2])
with open(jsonfile) as file:
    data = json.load(file)

df = pd.DataFrame(data['instanceRecommendations'])
for i,item in enumerate(df['utilizationMetrics']):
    for k in range(len(df['utilizationMetrics'][i])):
        #Add a new column with a default value and then add/update the value of that colm
        df.at[i,'utilizationMetrics_name_{}'.format(k)] = dict(df['utilizationMetrics'][i][k])['name']
        df.at[i,'utilizationMetrics_statistic_{}'.format(k)] = dict(df['utilizationMetrics'][i][k])['statistic']
        df.at[i,'utilizationMetrics_value_{}'.format(k)] = dict(df['utilizationMetrics'][i][k])['value']


    for m in range(len(df['recommendationOptions'][i])):
       df.at[i,'recommendationOptions_instanceType_{}'.format(m)] = dict(df['recommendationOptions'][i][m])['instanceType']
       df.at[i,'recommendationOptions_performanceRisk_{}'.format(m)] = dict(df['recommendationOptions'][i][m])['performanceRisk']
       df.at[i,'recommendationOptions_rank_{}'.format(m)] = dict(df['recommendationOptions'][i][m])['rank']
       for j in range(len(dict(df['recommendationOptions'][i][m])['projectedUtilizationMetrics'])):
           df.at[i,'reco_projectedUtilizationMetrics_{}_name_{}'.format(m,j)] = dict(dict(df['recommendationOptions'][i][m])['projectedUtilizationMetrics'][j])['name']
           df.at[i,'reco_projectedUtilizationMetrics_{}_statistic_{}'.format(m,j)] = dict(dict(df['recommendationOptions'][i][m])['projectedUtilizationMetrics'][j])['statistic']
           df.at[i,'reco_projectedUtilizationMetrics_{}_value_{}'.format(m,j)] = dict(dict(df['recommendationOptions'][i][m])['projectedUtilizationMetrics'][j])['value']

df = df.drop({'utilizationMetrics','recommendationOptions'}, axis=1)
df.to_csv(csvfile, header=True,index=False)
print("CSV File generated at..-  {}".format(csvfile))

Shell Script (which generates the json file which then parsed to python script to generate the CSV file)

#!/bin/sh
if [[ $# -lt 1 ]]; then
  echo "Usage: ${0} <AccountID> [<Region>]"
  exit 1
fi
NOW=$(date +"%m%d%Y%H%M")
AccountID=${1}
AWS_DEFAULT_REGION=${2} ## 3rd Argument is the Account Default Region is diff than the CLI server
script_top=/d01/app/aws_script/bin
outputdir=/d01/app/aws_script/output
csvfile=${outputdir}/${AccountID}_copt-${NOW}.csv
jsonfile=${outputdir}/${AccountID}_copt-${NOW}.json
#
## Reset Env variables
reset_env () {
        unset AWS_SESSION_TOKEN
        unset AWS_DEFAULT_REGION
        unset AWS_SECRET_ACCESS_KEY
        unset AWS_ACCESS_KEY_ID
} #end of reset_env
## Set Env function
assume_role () {
AccountID=${1}
source </path_to_source_env_file/filename> ${AccontID}
}
# Function assume_role ends
assume_role ${AccountID}
if [[ ! -z "$2" ]]; then
        AWS_DEFAULT_REGION='us-east-2'
fi
#
## Generate json file
aws compute-optimizer get-ec2-instance-recommendations | jq -r . >${jsonfile}
## Pass the json file to python script along with the CSV File for the output
python ${script_top}/reportComputeOptData.py ${jsonfile} ${csvfile}
echo "CSV File generated... - ${csvfile}"
reset_env

Json file format

{
“instanceRecommendations”: [
{
“instanceArn”: “arn:aws:ec2:eu-east-1:123404238928:instance/i-04a67rqw6c029b82f”,
“accountId”: “123404238928”,
“instanceName”: “testserver01”,
“currentInstanceType”: “c4.xlarge”,
“finding”: “OVER_PROVISIONED”,
“utilizationMetrics”: [
{
“name”: “CPU”,
“statistic”: “MAXIMUM”,
“value”: 6.3559322033898304
}
],
“lookBackPeriodInDays”: 14,
“recommendationOptions”: [
{
“instanceType”: “t3.large”,
“projectedUtilizationMetrics”: [
{
“name”: “CPU”,
“statistic”: “MAXIMUM”,
“value”: 12.711864406779661
}
],
“performanceRisk”: 3,
“rank”: 1
},
{
“instanceType”: “m5.large”,
“projectedUtilizationMetrics”: [
{
“name”: “CPU”,
“statistic”: “MAXIMUM”,
“value”: 12.711864406779661
}
],
“performanceRisk”: 1,
“rank”: 2
},
{
“instanceType”: “m4.large”,
“projectedUtilizationMetrics”: [
{
“name”: “CPU”,
“statistic”: “MAXIMUM”,
“value”: 15.645371577574968
}
],
“performanceRisk”: 1,
“rank”: 3
}
],
“recommendationSources”: [
{
“recommendationSourceArn”: “arn:aws:ec2:eu-east-1:123404238928:instance/i-04a67rqw6c029b82f”,
“recommendationSourceType”: “Ec2Instance”
}
],
“lastRefreshTimestamp”: 1583986171.637
},
{
“instanceArn”: “arn:aws:ec2:eu-east-1:123404238928:instance/i-0af6a6b96e2690002”,
“accountId”: “123404238928”,
“instanceName”: “TestServer02”,
“currentInstanceType”: “t2.micro”,
“finding”: “OPTIMIZED”,
“utilizationMetrics”: [
{
“name”: “CPU”,
“statistic”: “MAXIMUM”,
“value”: 96.27118644067791
}
],
“lookBackPeriodInDays”: 14,
“recommendationOptions”: [
{
“instanceType”: “t3.micro”,
“projectedUtilizationMetrics”: [
{
“name”: “CPU”,
“statistic”: “MAXIMUM”,
“value”: 39.1101694915254
}
],
“performanceRisk”: 1,
“rank”: 1
},
{
“instanceType”: “t2.micro”,
“projectedUtilizationMetrics”: [
{
“name”: “CPU”,
“statistic”: “MAXIMUM”,
“value”: 96.27118644067791
}
],
“performanceRisk”: 1,
“rank”: 2
}
],
“recommendationSources”: [
{
“recommendationSourceArn”: “arn:aws:ec2:eu-east-1:123404238928:instance/i-0af6a6b96e2690002”,
“recommendationSourceType”: “Ec2Instance”
}
],
“lastRefreshTimestamp”: 1583986172.297
}
],
“errors”: []
}

CSV file Output

Enjoy reading !!!
Anand M

Categories: AWS/Boto3/Python Tags: