Sorting Algorithms
Working with Data Structures and manipulating data.
import random
numbers = []
for i in range(10):
numbers.append(random.randint(0,100))
print("Random List")
print(numbers)
print("Sorted List")
numbers = sorted(numbers)
print(numbers)
Explore
Get into groups of 3
We will be focusing on 4 algorithms today.
We will look at the first one together, Bubble Sort
What is happening with this sort?
- In this sort, the algorithm checks each value and moves the largest value to the right. It keeps on moving the value until there is a value which is not greater than the values on the right.
- This sorts based on largest value, moving them towards the right side.
In your groups you will each choose to be an expert on a sorting algorithm. Merge, Selection, and Insertion. Take about 5 minutes to read about your algorithm (Geek for Geeks linked below) and be ready to explain it to your other group members.
Practice
[75, 17, 46, 80, 67, 45, 69, 79, 40, 0]
How would you sort this list with...
- Bubble Sort
- Bubble sort works by switching elements of a list which are right next to each other. It starts with the first two indices and then keeps on moving up until the last index of the list is the greatest. This repeats until the entire list is sorted. In this case, 75 and 17 are first compared and swaps the position of 17 and 75. The list keeps on moving with 75, until a greater value is found, swapping the larger value index number with the smaller number index number. This steps repeats until the list is in order.
- Selection Sort
- In selection sort, the first value is set to be the smallest value. The algorithm goes through the entire list and checks for the smallest value. If one is found, it is set as the minimum value and the index numbers are swapped, putting the smallest one at index 1. The list keeps on repeating for every index, finding where each element would go. This list would set 75 as the minimum value and then check for the smallest one in the list, which would be 0, repeating until the list is in order.
Explain.
- In selection sort, the first value is set to be the smallest value. The algorithm goes through the entire list and checks for the smallest value. If one is found, it is set as the minimum value and the index numbers are swapped, putting the smallest one at index 1. The list keeps on repeating for every index, finding where each element would go. This list would set 75 as the minimum value and then check for the smallest one in the list, which would be 0, repeating until the list is in order.
[88, 39, 53, 39, 58, 43, 74, 81, 71, 51]
How would you sort this list with...
- Merge Sort
- Merge sort works by splitting the list into shorter lists and sorting these shorter lists and then merging together and sorting the final list. First of all, the list is divided into smaller lists until the sublist has 1 or fewer elements in it. Then for each sublist, the elements are sorted. Finally, the smaller lists are merged and sorted finally to produce the original list in a sorted form. For this example, the list would first be split into parts of [88, 39, 53, 39, 58] and [43, 74, 81, 71, 51], then split up some more until there is one index or less in the sublists. Then, these sublists are sorted and then merged to get the final sorted list.
- Insertion Sort
- The insertion sort checks a sorted part of a list with an unsorted element and then inserts the element in the correct place. Insertion sort first assumes that the first element of the list is sorted and then checks the following element to add to the "sorted" list. For this case, it would start at 88 and assume 88 is sorted and then check whether 39 should go after or before 39. This keeps on repeating until the entire list is sorted.
Explain.
- The insertion sort checks a sorted part of a list with an unsorted element and then inserts the element in the correct place. Insertion sort first assumes that the first element of the list is sorted and then checks the following element to add to the "sorted" list. For this case, it would start at 88 and assume 88 is sorted and then check whether 39 should go after or before 39. This keeps on repeating until the entire list is sorted.
import nltk
import random
nltk.download('words') # Download the word list (only required once)
from nltk.corpus import words
english_words = words.words()
#print(len(english_words)) # Prints the number of words in the list
# You can now use the 'english_words' list in your code
def selection_sort(strings):
for i in range(len(strings)):
min_idx = i
for j in range(i + 1, len(strings)):
if strings[j] < strings[min_idx]:
min_idx = j
strings[i], strings[min_idx] = strings[min_idx], strings[i]
return strings
words = []
for i in range(10):
words.append(english_words[random.randint(0,len(english_words))])
print("Random List")
print(words)
print(selection_sort(words))
Discuss
Answer the following with your group.
- When should you use each algorithm? What makes an algorithm the right choice?
- Given the following lists...
- [0, 2, 6, 4, 8, 10]
- [Elephant, Banana, Cat, Dog, Apple]
- [29, 13, 83, 47, 32, 78, 100, 60, 65, 15, 24, 9, 40, 68, 53, 8, 90, 58, 39, 32, 34, 91, 74, 94, 49, 87, 34, 87, 23, 17, 27, 2, 38, 58, 84, 15, 9, 46, 74, 40, 44, 8, 55, 28, 81, 92, 81, 88, 53, 38, 19, 21, 9, 54, 21, 67, 3, 41, 3, 74, 13, 71, 70, 45, 5, 36, 80, 64, 97, 86, 73, 74, 94, 79, 49, 32, 20, 68, 64, 69, 1, 77, 31, 56, 100, 80, 48, 75, 85, 93, 67, 57, 26, 56, 43, 53, 59, 28, 67, 50] Select the algorithm you believe is best for each, explain.
HACKS
Provided below is a Bubble Sort Algorithm sorting a list of dictionaries based off of selected key.
-
Now it's time to do some coding...
-
Run code and then research and answer these questions...
- Is a list and/or dictionary in python considered a primitive or collection type? Why?
- Lists and dictionaries are considered to be collection types in python. This is because lists and dictionaries can store multiple values of data in them, while primitives hold only one specific piece. Lists hold multiple elements and dictionaries hold multiple key-value pairs.
- Is the list passed into bubble sort "pass-by-value" or "pass-by-reference? Describe why in relation to output.
- This code is pass-by-reference. This is because the bubble sort algorithm will alter the actual list, thanks to the swap code. This is seen by the output as the original list is changed based on the key sorted by, and the sorted list is not separate from the original list.
- Is a list and/or dictionary in python considered a primitive or collection type? Why?
-
Implement new cell(s) and/or organize cells to do the following.
- Create your own list
- Use your expertise sorting algorithm (selection, insertion, merge). Note, I got my bubble sort from Geek for Geeks and made modifications. Each student in a group should have a unique algorithm.
- Test your list with my bubble sort
- Test my list with your new sort
- Research analysis on sorting:comparisons, swaps, time. Build this into your hacks. - Find a better way to print the data, key first, then other elements in viewable form.
Use the code below to help guide your adventure
"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List with optimizations
"""
# bubble sorts a list of dictionaries, base off of provided key
def bubbleSort(list, key):
n = len(list) - 1 # list are indexed 0 to n-1, len is n
# Traverse through list with i index
for i in range(n):
swapped = False # optimize code, so it exits if now swaps on inner loop
# Inner traversal using j index
for j in range(n-i): # n-i as positions on right are in order in bubble
# Swap if the element KeyN is greater KeyN1
keyN = list[j].get(key)
keyN1 = list[j+1].get(key)
if keyN > keyN1:
swapped = True
list[j], list[j + 1] = list[j + 1], list[j] # single line swap
if not swapped: # if no swaps on inner pass, list is sorted
return # exit function
if __name__ == "__main__":
# list/dictionary sample
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
# assuming uniform keys, pick 1st row as source of keys
key_row = list_of_people[0]
# print list as defined
print("Original")
print(list_of_people)
for key in key_row: # finds each key in the row
print(key)
bubbleSort(list_of_people, key) # sort list of people
for item in list_of_people:
print(f"Team Name: {item.get('name')}, Age: {item.get('Age')}, City of Residence: {item.get('city')}")
def insertionSort(list, key):
n = len(list)
for i in range(1, n):
current = list[i]
j = i - 1
while j >= 0 and list[j].get(key) > current.get(key):
list[j + 1] = list[j]
j -= 1
list[j + 1] = current
return list
if __name__ == "__main__":
list_of_teams = [
{"name": "Patriots", "city": "New England", "SB Wins": 6},
{"name": "Seahawks", "city": "Seattle", "SB Wins": 1},
{"name": "Chiefs", "city": "Kansas City", "SB Wins": 3},
{"name": "Texans", "city": "Houston", "SB Wins": 0}
]
row_key = list_of_teams[0]
print("Original Dictionary")
print(list_of_teams)
for key in row_key:
print(key)
insertionSort(list_of_teams, key)
for item in list_of_teams:
print(f"Team Name: {item.get('name')}, City: {item.get('city')}, Super Bowl Wins: {item.get('SB Wins')}")
if __name__ == "__main__":
list_of_teams = [
{"name": "Patriots", "city": "New England", "SB Wins": 6},
{"name": "Seahawks", "city": "Seattle", "SB Wins": 1},
{"name": "Chiefs", "city": "Kansas City", "SB Wins": 3},
{"name": "Texans", "city": "Houston", "SB Wins": 0}
]
rowkey = list_of_teams[0]
print("Original Dictionary")
print(list_of_teams)
for key in rowkey:
print(key)
bubbleSort(list_of_teams, key)
for item in list_of_teams:
print(f"Team Name: {item.get('name')}, City: {item.get('city')}, Super Bowl Wins: {item.get('SB Wins')}")
if __name__ == "__main__":
# list/dictionary sample
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
# assuming uniform keys, pick 1st row as source of keys
keyrow = list_of_people[0]
# print list as defined
print("Original")
print(list_of_people)
for key in keyrow: # finds each key in the row
print(key)
insertionSort(list_of_people, key) # sort list of people
for item in list_of_people:
print(f"Team Name: {item.get('name')}, Age: {item.get('Age')}, City of Residence: {item.get('city')}")
Observations
I noticed from the code that obviously the code for the insertion sort appears to be shorter than the code for the bubble sort. However, there wasn't really a noticeable difference in the time it took to run the code. This can be seen because both bubble sort and insertion sort have time complexity of O^2. This is also the same as selection sort, but merge sort is O(nlogn). Merge sort would probably be much better for a larger database, like one containing data for all the NFL teams.
The insertion sort works by taking an element and then checking whether it is greater than an item in the list. The algorithm checks each element and then determines where to insert the element. This is repeated until the list is in order.
Given that insertion sort is O(n^2), it is fine for a small dictionary to be sorted. It would take much more time for a longer dictionary to be sorted when compared to using merge sort.
2D Arrays
Here is some code which I have created which utilizes 2D arrays in order to take images and piece them together. It receives a certain number of rows and columns based on user input and creates a 2D array of images based on this. It then selects random images and then combines them and displays them.
The idea between this code could be to create almost a virtual postcard which users could send. It could be modified to have user inputted words and include a link to our site. This code would probably need to be modified in order to have a url to send the request and then receive the request and take in user inputs for words.
from PIL import Image
import os
import random
# Directory containing the images
image_directory = "../images/"
# Get a list of image files in the directory
image_files = [f for f in os.listdir(image_directory) if os.path.isfile(os.path.join(image_directory, f))]
rows = int(input("Enter the number of rows (1-10): "))
cols = int(input("Enter the number of columns (1-10): "))
data = [[random.choice(image_files) for j in range(cols)] for i in range(rows)]
image_width = 50
image_height = 50
image = Image.new("RGB", (cols * image_width, rows * image_height))
for row in range(rows):
for col in range(cols):
image_file = data[row][col]
image_path = os.path.join(image_directory, image_file)
img = Image.open(image_path)
img = img.resize((image_width, image_height))
x1, y1 = col * image_width, row * image_height
x2, y2 = x1 + image_width, y1 + image_height
image.paste(img, (x1, y1, x2, y2))
print(f"{rows} rows")
print(f"{cols} columns")
display(image)
def merge_sort(dicts):
if len(dicts) <= 1:
return dicts
mid = len(dicts) // 2
left_half = dicts[:mid]
right_half = dicts[mid:]
left_half = merge_sort(left_half)
right_half = merge_sort(right_half)
return merge(left_half, right_half)
def merge(left, right):
merged = []
i = j = 0
while i < len(left) and j < len(right):
if left[i]['likes'] > right[j]['likes']: # Compare in descending order
merged.append(left[i])
i += 1
else:
merged.append(right[j])
j += 1
while i < len(left):
merged.append(left[i])
i += 1
while j < len(right):
merged.append(right[j])
j += 1
return merged
# Example list of dictionaries with images and likes
dicts = [
{'name': 'sky', 'image': '../sky.jpg', 'likes': 10},
{'name': 'ocean', 'image': '../ocean.jpg', 'likes': 25},
{'name': 'bird', 'image': '../bird.jpg', 'likes': 13},
{'name': 'runner', 'image': '../runner.jpg', 'likes': 5}
]
# Sort the list of dictionaries based on the 'likes' value using Merge Sort
sorted_dicts = merge_sort(dicts)
# Print the sorted list
for item in sorted_dicts:
print(f"Image Name: {item.get('name')}, Like Count: {item.get('likes')}")
How this Code can be Used
This code could be used for my group's project because it uses merge sort to check the amount of likes which an image has. This makes sure to order the database to display to the order the most liked images. This algorithm could probably be changed in order to order the database in order of things like likes in decreasing order or image name. Merge sort is effective because of the lower time complexity than other sorts, which will make it efficient for the project because it has a lower run time for a larger data set. We will need to process a lot of image data, so this is why it would be important.
This code would be needed to be changed in order to be able to fetch image data from a url. This will some results to be changed as it fetchs the json data.