How time flies! Here comes the final week of class and also my slog finale.
Compared with recursion, I find sorting an easier part of 148 that follows smoothly from 108. In 108, we learned three simple but less efficient sorting algorithms. Apart from revising those, we were also introduced to merge, quick and count sort. I have explained merge and quick sort in detail in the previous two slog entries. As for count sort, it is especially efficient when the given numbers fall within a known range. It basically records the occurrences of each number at its corresponding position in a new list, then outputs a sorted list according to the order and counts of those numbers.
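To make that concrete, here is a minimal sketch of the counting idea, assuming the input is a list of non-negative integers no larger than a known max_value (the function name and signature are my own, not the in-class code):

    def count_sort(lst, max_value):
        """Sort non-negative integers <= max_value by counting occurrences."""
        counts = [0] * (max_value + 1)
        for x in lst:                                  # record how often each value appears
            counts[x] += 1
        result = []
        for value, occurrences in enumerate(counts):   # read the counts back out in order
            result.extend([value] * occurrences)
        return result

    # count_sort([3, 1, 4, 1, 5], 5) -> [1, 1, 3, 4, 5]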
Surprisingly, in last week's lab, according to the graph we plotted, insertion sort was less efficient than selection sort. I thought insertion has a best case and a worst case while selection always has to do all the comparisons, so the result seems contrary to what I expected. A possible explanation is that insertion involves shifting every item behind the insertion point backwards, while selection only needs to exchange the items in two positions. Thus they spend different amounts of time shifting and exchanging values.
In addition, we made small adjustments to the existing sorting code to improve efficiency. For example, in merge, replacing "while L1[i1:] and L2[i2:]" with "while i1 < len(L1) and i2 < len(L2)" makes a great difference, because each slice like L[a:] is a linear copy of part of the list, which takes O(n) time on every loop iteration. As guided in the last lab, avoiding redundancy is one way to improve efficiency.
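Here is a minimal sketch of that faster merge, comparing integer indices against len() so no copies are made inside the loop (the parameter names L1/L2 follow the snippet above, but this is my own version, not the course file):

    def merge(L1, L2):
        """Merge two already sorted lists using indices, not slices."""
        merged, i1, i2 = [], 0, 0
        while i1 < len(L1) and i2 < len(L2):   # cheap integer comparisons
            if L1[i1] <= L2[i2]:
                merged.append(L1[i1])
                i1 += 1
            else:
                merged.append(L2[i2])
                i2 += 1
        return merged + L1[i1:] + L2[i2:]      # only one final copy of each leftover tail

    # merge([1, 3, 5], [2, 4]) -> [1, 2, 3, 4, 5]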
To understand sorting, it is useful to count the number of divisions and comparisons in order to analyze efficiency. A practical example is the last question on the second midterm, which I approached using insights from merge sort and quick sort.
Questions:
1. Why would sorting in place be faster than creating a new list?
2. What about collisions? If we have a fifty percent chance of a collision after 23 items, how long a list do we need? How does probing a list work?
Sunday, March 30, 2014
Saturday, March 22, 2014
Week 10 more on sorting
My hard work from last weekend paid off by letting me complete A2 in advance. Though I haven't fully understood the code for permutations, putting it to use is sweet.
Nevertheless, I still benefited from Monday's class by finding a significant flaw in my code. An insight about regex_match when it's used on a StarTree is that s is supposed to be a concatenation of pieces si where every si matches r. Initially, I only pictured the simplest case of a single symbol followed by *. Then I realized it could also be (0.1)*, which will match every string made of repeated copies of "01". Still, this part is tricky and I couldn't have done it without Danny's hint, which is to think of s as a combination of s1 and s2, where s1 matches r and s2 matches the rest against r* again. Computer science really requires rigorous consideration of every case.
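Here is a minimal sketch of that splitting idea, stripped of the assignment's tree classes: matches_r stands in for whatever decides whether a single piece matches the inner regex r, so this only illustrates the hint, not the actual regex_match code:

    def star_match(matches_r, s):
        """Return True iff s is s1 + s2 + ... + sk (k >= 0) where every
        piece satisfies matches_r."""
        if s == '':                        # zero copies of r always match r*
            return True
        # try every non-empty prefix as s1; the rest must again match r*
        for i in range(1, len(s) + 1):
            s1, s2 = s[:i], s[i:]
            if matches_r(s1) and star_match(matches_r, s2):
                return True
        return False

    # star_match(lambda piece: piece == '01', '010101') -> True
    # star_match(lambda piece: piece == '01', '011')    -> False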
In addition, we were introduced to two new sorting methods, which are considerably more efficient.
Quick sort: as the name suggests, I think it's the most efficient sorting algorithm we have met so far. The main idea is to pick a pivot and divide the list into two parts, the first containing elements smaller than the pivot and the second containing elements larger than the pivot. Repeating these steps on the two parts leads us to the base case where there is only one element. Dividing the list down to the smallest unit takes about log n levels of splitting. Meanwhile, at each level we have to compare the whole list against the pivots, which is about n comparisons. So altogether, quick sort requires about n log n steps. Interestingly, we later modified the code to choose a random pivot from the list. To see why, picture a scenario where the list is in reverse order and we always start with the first element as the pivot: the rest of the list is all smaller than the pivot, so the whole process takes up to n^2 steps, which is similar to insertion, selection and bubble sort.
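A minimal sketch of the idea, returning a new list rather than sorting in place and using the random pivot mentioned above (this is my own simplified version, not the in-class code):

    import random

    def quick_sort(lst):
        """Sort by partitioning around a randomly chosen pivot."""
        if len(lst) <= 1:                       # base case: nothing to split
            return lst[:]
        pivot = random.choice(lst)
        smaller = [x for x in lst if x < pivot]
        equal = [x for x in lst if x == pivot]
        larger = [x for x in lst if x > pivot]
        return quick_sort(smaller) + equal + quick_sort(larger)

    # quick_sort([5, 4, 3, 2, 1]) -> [1, 2, 3, 4, 5]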
Merge sort: this is a tough one, and it took me a while to understand. Starting from the basics, merging means taking two already sorted lists and repeatedly comparing their smallest remaining items, moving the smaller of the two out to become the next item of a new list. Merge sort shares some of quick sort's ideas in that they have the same general divide-into-halves structure and the same base case. One difference, though, is that merge sort always splits at the middle, while quick sort's split depends on where the pivot lands. Tracing backwards, the most tedious scenario merges every pair of numbers together, and each level of merging takes about n comparisons. To sum up, merge sort is also O(n log n).
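For reference, a minimal sketch of the whole algorithm, again my own version rather than the course file:

    def merge(left, right):
        """Merge two already sorted lists into one sorted list."""
        result, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                result.append(left[i])
                i += 1
            else:
                result.append(right[j])
                j += 1
        return result + left[i:] + right[j:]   # one of these tails is empty

    def merge_sort(lst):
        """Split at the middle, sort each half, then merge them."""
        if len(lst) <= 1:                      # base case
            return lst[:]
        mid = len(lst) // 2
        return merge(merge_sort(lst[:mid]), merge_sort(lst[mid:]))

    # merge_sort([5, 1, 4, 2, 3]) -> [1, 2, 3, 4, 5]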
From this week's reading, here is a useful note on insertion:
Shifting versus exchanging is also important. In general, a shift operation requires approximately a third of the processing work of an exchange since only one assignment is performed. In benchmark studies, insertion sort will show very good performance.
Tuesday, March 18, 2014
Week 9 continuing with binary search trees and an introduction to sorting
For BSTs, when we are writing the code, it's important to keep in mind the general structure of a BST, which is essential to its amazing efficiency. As long as we understand the logic there, we can identify the distinct cases to consider and then write the code. With regards to the delete function, I am still troubled by an aliasing problem: if we leave out return_node and instead directly return node.right/node.left, we get the wrong outcome. I would appreciate it if anyone could explain this to me.
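For comparison, here is a minimal sketch of a delete that follows the return-and-relink pattern, where every recursive call returns the root of the (possibly changed) subtree and the caller stores it back into node.left or node.right. The class name BSTNode and the helper name _delete are my own, not necessarily the course's:

    class BSTNode:
        """A minimal binary search tree node."""
        def __init__(self, data, left=None, right=None):
            self.data = data
            self.left = left
            self.right = right

    def _delete(node, data):
        """Remove data from the subtree rooted at node; return the new root."""
        if node is None:                         # data not found
            return None
        if data < node.data:
            node.left = _delete(node.left, data)
        elif data > node.data:
            node.right = _delete(node.right, data)
        else:                                    # found it
            if node.left is None:                # at most one child: splice it out
                return node.right
            if node.right is None:
                return node.left
            # two children: copy up the largest value on the left, then
            # delete that value from the left subtree
            largest = node.left
            while largest.right is not None:
                largest = largest.right
            node.data = largest.data
            node.left = _delete(node.left, largest.data)
        return node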
Meanwhile, we started on sorting this week. What confused me at first was how to account for the efficiency of sorting, since comparisons and swaps sit side by side in the code. The answer is: count the number of comparisons. I will provide a summary below reviewing the sorting algorithms we have learned since 108, with a code sketch after the three descriptions.
Selection sort: the basic idea is to select the smallest remaining item and move it to the front, starting at index 0. Thus, in the first round there are n-1 comparisons, then n-2, n-3 and so on. Adding them together, we get a quadratic expression.
Insertion sort: starting at the second item, we look at the current item and compare it with the previous one; if it is smaller, we shift it forward and repeat. As a result, there is a worst case and an optimal case, corresponding to a reversed list and an already sorted list. In the worst case, the number of comparisons is the same as selection, and the time spent shifting is not negligible. On the contrary, the optimal case only requires n-1 comparisons.
Bubble sort: either bubble the smallest item to the front or the largest to the end by comparing adjacent items. Thereby it takes almost the same amount of time as selection to arrive at a sorted list.
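As promised, a minimal in-place sketch of the three 108 sorts (my own versions, written to match the descriptions above):

    def selection_sort(lst):
        """Swap the smallest remaining item into position i on each pass."""
        for i in range(len(lst) - 1):
            smallest = i
            for j in range(i + 1, len(lst)):
                if lst[j] < lst[smallest]:
                    smallest = j
            lst[i], lst[smallest] = lst[smallest], lst[i]

    def insertion_sort(lst):
        """Shift each item backwards past larger items until it fits."""
        for i in range(1, len(lst)):
            value = lst[i]
            j = i
            while j > 0 and lst[j - 1] > value:
                lst[j] = lst[j - 1]        # shift rather than swap
                j -= 1
            lst[j] = value

    def bubble_sort(lst):
        """Bubble the largest remaining item to the end on each pass."""
        for end in range(len(lst) - 1, 0, -1):
            for j in range(end):
                if lst[j] > lst[j + 1]:
                    lst[j], lst[j + 1] = lst[j + 1], lst[j]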
Notably, Danny mentioned in class that polynomial running times are acceptable, but exponential ones are not.
Saturday, March 8, 2014
Week 8 on linked structures
Last week, a typical nested linked list was constructed such that self.head points to a value and self.rest points to the subsequent linked list. The key thing is that each linked list ends with an empty linked list.
This week, we investigated a different way of representing a linked list with the help of a new class, Node. In my understanding, a node works in a similar way to last week's linked list, and the whole chain of nodes is stored in one of the linked list's attributes.
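Roughly, the arrangement looks like this; the attribute names (value, next, front) and the prepend method are my guesses at the idea, not the exact class code:

    class Node:
        """One node: a value plus a reference to the next node (or None)."""
        def __init__(self, value, next_=None):
            self.value = value
            self.next = next_

    class LinkedList:
        """A wrapper whose whole chain of nodes hangs off self.front."""
        def __init__(self):
            self.front = None

        def prepend(self, value):
            """Add a new value at the front of the list."""
            self.front = Node(value, self.front)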
Similarly, a binary tree can be represented using BTNode. The idea seems to be to transfer a recursive structure into a different class, though I don't get why we have to do so. In the printing method, in order to produce a visually appealing picture, a reversed in-order traversal is used to print the right side first. We further apply BTNode to BST (binary search tree). As a result, the binary search tree is no longer a Tree structure, but rather a recursive BTNode structure. But regarding the _insert function, why do we have to create a return_node and set it equal to node? What's wrong with directly modifying node?
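The pattern I believe this question is about looks roughly like this: the helper returns the root of the subtree it was given (or a brand-new node), and the caller stores that return value back into its own left/right attribute. This is only my sketch of the pattern, with a BTNode assumed to hold data, left and right:

    class BTNode:
        """A minimal binary tree node."""
        def __init__(self, data, left=None, right=None):
            self.data = data
            self.left = left
            self.right = right

    def _insert(node, data):
        """Insert data into the subtree rooted at node; return its root."""
        if node is None:                  # empty spot: the new node becomes
            return BTNode(data)           # the root of this subtree
        if data < node.data:
            node.left = _insert(node.left, data)
        elif data > node.data:
            node.right = _insert(node.right, data)
        return node                       # otherwise the root is unchanged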
Questions:
1. I found in the lab that copying a linked list can be tricky, because each linked list occupies a unique id (address), yet changing its values can still cause aliasing (if that's the correct word). So I am wondering whether we should use deep copy instead of the shallow copy method? (A small demo follows after these questions.)
Here is an illuminating answer I copied down from Piazza: "Shallow copy - your first code chunk has 3 assignments and 1 shallow copy. copy.copy(x) creates a new object y, with id(y) ≠ id(x), such that for every attribute s of x, id(x.s) = id(y.s). So y is a distinct copy of x in basically the most superficial (shallow) way possible."
2. I am having trouble understanding the code for reversing a linked list (a commented sketch of one version is below as well).
3. When I call an object of a class in the Python shell, does it return __str__ or __repr__ of the object?
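For question 1, here is a small demo of the difference, using the copy module on a two-node chain (the Node class is the same minimal one sketched earlier; the behaviour of copy.copy and copy.deepcopy is standard Python):

    import copy

    class Node:
        def __init__(self, value, next_=None):
            self.value = value
            self.next = next_

    original = Node(1, Node(2))            # the chain 1 -> 2

    shallow = copy.copy(original)          # new outer Node, shared tail
    deep = copy.deepcopy(original)         # new Nodes all the way down

    print(shallow is original)             # False: a distinct outer object
    print(shallow.next is original.next)   # True: the rest is still shared
    print(deep.next is original.next)      # False: nothing is shared

    original.next.value = 99
    print(shallow.next.value)              # 99 -- aliasing through the shared tail
    print(deep.next.value)                 # 2  -- unaffected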
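For question 2, this is the version of the reversal I find easiest to follow: walk the chain once and flip each node's next pointer to the node that came before it. It reuses the Node class from the demo above and is my own sketch, not the lab's code:

    def reverse(front):
        """Return the front node of the reversed chain starting at front."""
        previous = None
        current = front
        while current is not None:
            rest = current.next        # remember the rest of the chain
            current.next = previous    # flip this node's pointer backwards
            previous = current         # this node is now the reversed front
            current = rest             # move on to the remembered rest
        return previous

    # reverse(Node(1, Node(2, Node(3)))) gives the chain 3 -> 2 -> 1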
Saturday, March 1, 2014
Week 7 on recursion
As the saying goes, practice makes perfect, and I feel like I am currently exercising that idiom on recursion. Though perfect is still well ahead of me, I can see the great leap I took in understanding and applying recursion. I still remember the day the prof introduced recursion with nested lists. I was so frustrated, confused and lost that recursion became what my friend and I discussed over a Friday night dinner.
Up to now, each week's material has been closely related to recursion, and I think I cover it at least partially in each week's post, just for your information. Since this week we had a term test, I think everybody did a systematic review of the past seven weeks' material. Undoubtedly, my understanding of recursion was enhanced through this practice, particularly by looking back at all the slides and labs.
In my way of solving a recursive problem, it is necessary to start from the simplest case. For example, given a problem with nested lists, we can start with a non-nested list (just a simple list) to see what the question is asking for. A little sketching or drawing never hurts.
Speaking of the essence of recursion, the next step is what the professor calls the recursive insight, which is how we build the recursion on top of the previous case. Here there are generally two ways of writing the code. Most of the examples given in class are written as list comprehensions. I found this a little difficult to understand at first, but after a lot of practice it is sometimes the only solution I can think of. List comprehensions require one to be familiar with the many built-ins that work on lists, namely sum, any, all... Nevertheless, they have their limits: when the problem becomes more complicated, not every scenario can be covered in one or two lines. That is when we should adopt the block method. I remember being stuck on the third question in lab 4 for a very long time; it made me realize the importance of flexibility in choosing which method to use. Notably, in block form it is important to have a return statement when you call the function itself, or whenever you want the result of this step to be used in the next. I made a lot of mistakes here, so this is just a reminder for everyone.
Finally, I finish up recursion with the base case. I know I am kind of doing the second and third steps backwards, and occasionally I even forget about the base case. Interestingly, sometimes the base case is latently wired into the code and doesn't require significant effort to write. For example, when we want to sum up all the items in a nested list, the base case is automatically sum over a flat list, as in the sketch below.
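Here is what I mean, in comprehension style; a flat list of numbers never triggers the recursive branch, so the base case is handled implicitly (my own example, with the name nested_sum made up):

    def nested_sum(L):
        """Sum all the numbers in a possibly nested list."""
        return sum(nested_sum(item) if isinstance(item, list) else item
                   for item in L)

    # nested_sum([1, [2, [3, 4]], 5]) -> 15
    # nested_sum([1, 2, 3]) -> 6, exactly what sum([1, 2, 3]) would give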
Below is an interesting (kind of tricky) example I found in the tree structure code:
counting the leaves in a tree: return sum(count(i) for i in t.children) + (0 if t.children else 1)
The simplest case is where the height of the tree is 0. Then t.children is empty (falsy), so we return 1.
The recursive insight, meanwhile, is more mathematical: count the leaves of each of the children and add them up.
The trickiest part, I find, is how the data is accumulated. When I was thinking about this question myself, I didn't think of the second part of the return statement. This elaborate design adds nothing extra when a tree has children, but returns 1 when it reaches a leaf; summing everything up then completes the count. A runnable version is sketched below.
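A self-contained version of the example, with a minimal Tree class of my own so it can actually be run:

    class Tree:
        """A minimal tree: a value plus a (possibly empty) list of children."""
        def __init__(self, value, children=None):
            self.value = value
            self.children = children if children is not None else []

    def count_leaves(t):
        """Each child contributes its own leaf count; the trailing
        conditional adds 1 only when t itself is a leaf."""
        return (sum(count_leaves(c) for c in t.children)
                + (0 if t.children else 1))

    t = Tree(1, [Tree(2), Tree(3, [Tree(4)])])
    # the leaves are the nodes holding 2 and 4, so count_leaves(t) == 2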
Recursion is indeed an artistic design and a product of wisdom.