To make sense of data, we frequently organize information in lists and perform numerical operations on them, such as sum, min, max, and average. We can do these operations using built-in Python lists and loops, as discussed in previous tutorials, but there is no need for that overhead when NumPy can do most of them with a single function call. Looping through huge arrays (millions of records) in pure Python is also much slower than using an optimized library such as NumPy. With NumPy, a few lines of code are often enough to complete the analysis we are performing.
NumPy also lets us easily select a portion of the data using indexing and perform operations on just that portion. Doing this with built-in Python lists becomes cumbersome, especially when several lists are involved. An example is finding the heights of students who meet some condition, such as having a grade between 50 and 70.
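As a minimal sketch of that last case, assume two parallel arrays in which position i refers to the same student. The names student_grades and heights_cm and all of the values are made up for illustration.

```python
import numpy as np

# Hypothetical data: index i in both arrays refers to the same student.
student_grades = np.array([45, 52, 68, 71, 60, 90])
heights_cm = np.array([150, 162, 171, 158, 180, 165])

# Boolean mask: True where the grade is between 50 and 70 (inclusive).
# Each comparison needs its own parentheses when combined with &.
in_range = (student_grades >= 50) & (student_grades <= 70)

# A mask built from one array can be used to select from the other.
selected_heights = heights_cm[in_range]
print(selected_heights)
```

Doing the same with plain lists would mean looping over both lists together and collecting the matches by hand.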
Example #1: MOAR Looping!
Let's say a class has the following grades, where every student failed. The teacher decides to double each student's grade so that some of them pass. The code below first does this with a built-in Python list and a loop, then does the same with NumPy, which needs no explicit loop over the data.
import numpy as np
grades = [20,10,30,40,10,20,12,14,15,16,14,12,16]
#old and busted
new_grades = []
for grade in grades:
    new_grades.append(grade*2)
print(new_grades)
#new hotness
grades = np.array(grades)
new_grades = grades*2
print(new_grades)
Example #2: Descriptive Statistics
Finding the min, max, average, and other such summary statistics of the grades becomes a lot easier.
min_grade = grades.min()
max_grade = grades.max()
avg_grade = grades.sum()/len(grades)
print("Min grade: ",min_grade)
print("Max grade: ",max_grade)
print("Avg grade: ",avg_grade)
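NumPy also offers several of these summaries as single calls. The sketch below reuses the grades from Example #1 and computes the average with mean(). The median and standard deviation are extra statistics added here for illustration and are not part of the original example.

```python
import numpy as np

grades = np.array([20,10,30,40,10,20,12,14,15,16,14,12,16])

# mean() gives the same value as grades.sum()/len(grades).
avg_grade = grades.mean()

# Median: the middle value once the grades are sorted.
median_grade = np.median(grades)

# Standard deviation: how spread out the grades are around the mean.
std_grade = grades.std()

print("Avg grade:", avg_grade)
print("Median grade:", median_grade)
print("Std of grades:", std_grade)
```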
Example #3: Selecting And Filtering Values
Let's select all grades less than 25, then find the maximum grade among the students who meet that condition.
print(grades[grades<25])
print(grades[grades<25].max())
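The same kind of condition can also be used to count matches. This is a small sketch, again assuming the grades from Example #1. It relies on True counting as 1 and False as 0 when a boolean array is summed or averaged.

```python
import numpy as np

grades = np.array([20,10,30,40,10,20,12,14,15,16,14,12,16])

# The comparison alone yields a boolean array, one entry per student.
below_25 = grades < 25

# Summing the mask counts the students who meet the condition.
count_below = below_25.sum()

# Averaging the mask gives the fraction of students who meet it.
share_below = below_25.mean()

print("Students below 25:", count_below)
print("Share below 25:", share_below)
```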
Copyright © 2020, Mass Street Analytics, LLC. All Rights Reserved.