Sunday, March 30, 2014

slog finale on sorting algorithms

How time flies! Here comes the final week of class and also my slog finale.

Compared with recursion, I find sorting an easier part of 148 that follows smoothly from 108. In 108, we learned three simple but less efficient sorting algorithms. Apart from revisiting those, we were also introduced to merge, quick and count sort. I have explained merge and quick sort in detail in the previous two slog entries. As for count sort, it is especially efficient when the given numbers fall within a known range. It basically records the occurrences of each number at its corresponding position in a new list, then outputs a sorted list according to the sequence and occurrences of the numbers.
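
Here's a minimal sketch of count sort as I understand it (my own code, assuming the inputs are non-negative integers below a known bound):

    def count_sort(nums, bound):
        """Sort a list of ints in range(bound). My own sketch of count sort."""
        counts = [0] * bound                # counts[v] = occurrences of value v
        for v in nums:
            counts[v] += 1
        result = []
        for v in range(bound):              # walk the values in order...
            result.extend([v] * counts[v])  # ...emitting each once per occurrence
        return result

    # e.g. count_sort([3, 1, 3, 0, 2], 4) -> [0, 1, 2, 3, 3]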

Surprisingly, in last week's lab, according to the graph we plotted, insertion sort was less efficient than selection sort. I thought insertion has a best case and a worst case while selection needs to do all the comparisons regardless, so the result is contrary to what I expected. A possible explanation might be that insertion involves shifting every item behind the insertion point backwards, while selection only needs to exchange the items in two positions. Thus, they take different amounts of time for shifting and exchanging values.

In addition, we made small adjustments to the existing sorting code to improve efficiency. For example, in merge, replacing the loop condition "while L1[i1:] and L2[i2:]" with "while i1 < len(L1) and i2 < len(L2)" can make a great difference, because each slice L[a:] creates a linear copy of part of the list, which takes O(n) time on every iteration. As guided in the last lab, avoiding redundancy is one way to improve efficiency.
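
For instance, a minimal sketch of the merge step with the cheaper condition (my own reconstruction, not the exact lab code):

    def merge(L1, L2):
        result, i1, i2 = [], 0, 0
        # index comparisons are O(1); slicing like L1[i1:] would copy every loop
        while i1 < len(L1) and i2 < len(L2):
            if L1[i1] <= L2[i2]:
                result.append(L1[i1]); i1 += 1
            else:
                result.append(L2[i2]); i2 += 1
        return result + L1[i1:] + L2[i2:]   # one final copy of the leftovers is fine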

To understand sorting, it is useful to count the number of divisions and comparisons in order to analyze efficiency. A practical example was the last question on the second midterm, which I gained insights into from merge sort and quick sort.



Question:
1. why would sorting in place be faster than creating a new list?
2. what about collisions? if we have a fifty percent chance of collision after 23 items, how long a list do we need? how does probing a list work?

Saturday, March 22, 2014

Week 10 more on sorting

My hard work from last weekend paid off by letting me complete A2 in advance. Though I haven't fully understood the code for permutations, utilizing it is sweet.

Nevertheless, I still benefited from Monday's class by finding a significant flaw in my code. An insight for regex_match when it's used on a StarTree is that s is supposed to be a concatenation of strings si where every si matches r. Initially, I only pictured the simplest case of a single symbol followed by *. Then I realized it could also be something like (0.1)*, which will match every string made of repeated copies of "01". Still, this part is tricky and I couldn't have done it without Danny's hint, which is to think of s as a combination of s1 and s2, where s1 matches r and s2 matches the rest against r* again. Computer science really requires rigorous consideration of matters.

In addition, we were introduced to two new methods of sorting, which are considerably more efficient.

Quick sort: as the name tells, I think it's the most efficient sorting algorithm we have gotten to know so far. The main idea is to pick a pivot and divide the list into two parts, the first containing elements smaller than the pivot and the second containing elements larger than the pivot. Repeating these steps on the two parts leads us to the base case where there is only one element. Dividing the list down to single elements takes about log n levels. Meanwhile, at each level, we have to compare the whole list against the pivots. So altogether, quick sort requires about n log n steps. Interestingly, we later modified the code to choose a random pivot in the list. Picture a scenario where the list is in reverse order and we start with the first element every time: the rest of the list is all smaller than the pivot, so the whole process takes up to n^2 steps, which is similar to insertion, selection and bubble.
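
Here's a minimal sketch of quick sort as I understand it (my own code, with the random pivot idea from class):

    import random

    def quick_sort(L):
        if len(L) <= 1:             # base case: nothing left to divide
            return L
        pivot = random.choice(L)    # random pivot guards against sorted/reversed input
        smaller = [x for x in L if x < pivot]
        equal = [x for x in L if x == pivot]
        larger = [x for x in L if x > pivot]
        return quick_sort(smaller) + equal + quick_sort(larger)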

Merge sort: This is a tough one. It took me a while to understand it. Starting from the basics, merging means taking two already-sorted lists and repeatedly choosing the smaller of their two smallest items, taking it out to be the next item of a new list. Merge sort shares some of quick sort's ideas in that they have the same general divide-and-recurse structure and the same base case. One difference, though, is that merge sort always splits at the middle while quick sort's split point depends on the pivot. Tracing backwards, even the most tedious level, merging every pair of lists together, takes about n comparisons, and there are log n levels. To sum up, merge sort is also O(n log n).
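
And a minimal sketch of merge sort (again my own code, not necessarily the slides' version):

    def merge_sort(L):
        if len(L) <= 1:                     # base case: one item is already sorted
            return L
        mid = len(L) // 2                   # always split at the middle
        left = merge_sort(L[:mid])
        right = merge_sort(L[mid:])
        # merge: repeatedly take the smaller front item of the two sorted halves
        result, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                result.append(left[i]); i += 1
            else:
                result.append(right[j]); j += 1
        return result + left[i:] + right[j:]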

From this week's reading, here is a useful note on insertion:
 Shifting versus exchanging is also important. In general, a shift operation requires approximately a third of the processing work of an exchange since only one assignment is performed. In benchmark studies, insertion sort will show very good performance.

Tuesday, March 18, 2014

Week 9 continuing on binary search trees and introduction to sorting

For BST, when we are writing the code, it's important to keep in mind the general structure of a BST, which is essential to its amazing efficiency. As long as we understand the logic there, we can arrive at the distinct cases to consider and thus carry out the code. With regards to the delete function, I am still troubled by the aliasing problem. If we leave out return_node and instead directly return node.right/node.left, it results in the wrong outcome. I would appreciate it if anyone could explain this to me.
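
Here's a minimal sketch of the pattern as I currently understand it (my own names and code, not the course version). The trick seems to be that every call returns the root of the updated subtree, and the caller rebinds its child to that result:

    def _largest(node):
        # assumed helper (my own name): largest value in a non-empty subtree
        while node.right is not None:
            node = node.right
        return node.value

    def _delete(node, value):
        if node is None:
            return None
        if value < node.value:
            node.left = _delete(node.left, value)    # rebind child to the result
        elif value > node.value:
            node.right = _delete(node.right, value)
        elif node.left is None:
            return node.right        # 0 or 1 child: hand the child up to the caller
        elif node.right is None:
            return node.left
        else:                        # two children: replace value, then delete it
            node.value = _largest(node.left)
            node.left = _delete(node.left, node.value)
        return node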

In the meanwhile, we started on sorting this week. What confused me at first is how we account for the efficiency of sorting, since comparisons and swaps exist side by side in the code. The answer is: count the number of comparisons. I will provide a summary below reviewing the sorting algorithms we have learned since 108, followed by a small sketch of one of them.
Selection sort: the basic idea is to select the smallest item to the front, starting at index 0. Thus, in the first round there would be n-1 comparisons, then n-2, n-3 and so on. Adding them together, we get a quadratic expression.
Insertion sort: starting at the second item, we compare each item with the previous one; if it is smaller, shift it forwards, and repeat. As a result, there exist a worst case and a best case, corresponding to reverse-sorted and already-sorted input. For the worst case, the number of comparisons is the same as selection; meanwhile, the time for shifting is not negligible. On the contrary, the best case only requires n-1 comparisons.
Bubble sort: either bubble the smallest to the front or bubble the largest to the end by comparing adjacent items. Thereby, it takes almost the same amount of time as selection to arrive at a sorted list.
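
As promised, here is a minimal sketch of one of them, insertion sort (my own code), showing the shifting that makes the worst case costly and the best case cheap:

    def insertion_sort(L):
        for i in range(1, len(L)):
            current = L[i]
            j = i
            # shift larger items one slot to the right until current fits
            while j > 0 and L[j - 1] > current:
                L[j] = L[j - 1]
                j -= 1
            L[j] = current   # already-sorted input: one comparison per item
        return L
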
Notably, Danny mentioned in class that polynomial running times are acceptable, but exponential ones are not.

Saturday, March 8, 2014

week 8 on linked structures

Last week, a typical nested linked list was constructed such that self.head points to a value and self.rest points to the subsequent linked list. The key thing is that each linked list ends with an empty linked list.

This week, we investigated a different way of representing a linked list with the help of a new class, Node. In my understanding, a node works in a similar way to last week's linked list, and the whole node structure is stored in one of the linked list's attributes.
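
A minimal sketch of how I picture it (my own names, probably not identical to the course code):

    class Node:
        def __init__(self, value, next_=None):
            self.value = value      # like last week's self.head
            self.next = next_       # like self.rest, but None marks the end

    class LinkedList:
        def __init__(self):
            self.front = None       # the whole node chain hangs off one attribute

    # e.g. building 1 -> 2: lst = LinkedList(); lst.front = Node(1, Node(2))
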
Similarly, a binary tree can be represented using BTNode. The idea seems to be to transfer a recursive structure into a different class, though I don't get why we have to do so. In the printing method, in order to produce a visually appealing graph, a reversed in-order traversal is used to print the right side first. We further apply BTNode to BST (binary search tree). As a result, the binary search tree is no longer a tree structure itself, but rather a wrapper around a recursive BTNode structure. But regarding the _insert function, why do we have to create a return_node and set it equal to node? What's wrong with directly modifying node?
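
My current guess, as a minimal sketch (my own code, assuming a BTNode with value/left/right): _insert must return the subtree's root, because an empty subtree has no node to mutate, so the caller has to rebind its child to whatever comes back:

    def _insert(node, value):
        if node is None:               # empty subtree: nothing to mutate,
            return BTNode(value)       # so a brand-new node must be returned
        if value < node.value:
            node.left = _insert(node.left, value)    # rebind child to the result
        else:
            node.right = _insert(node.right, value)
        return node                    # unchanged nodes return themselves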



Question:
1. I found in lab that copying a linked list can be tricky, because each linked list occupies a unique id/address; changing its value, however, could cause aliasing (if that's the correct word). So I am wondering: should we use deep copy instead of the shallow copy method?
Here is an illuminating answer I copied down from Piazza: "Shallow copy - your first code chunk has 3 assignments and 1 shallow copy. copy.copy(x) creates a new object y, with id(y) ≠ id(x), such that for every attribute s of x, id(x.s) = id(y.s). So y is a distinct copy of x in basically the most superficial (shallow) way possible."
2. I am having trouble understanding the code for reversing a linked list (see the sketch after these questions for how far I've gotten).
3. when I evaluate an object of a class in the Python shell, does it return the __str__ or the __repr__ of the object?
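
For question 2, here is my sketch of the iterative reversal so far (my own code, reusing the Node names from the sketch above):

    def reverse(lst):
        prev = None
        curr = lst.front
        while curr is not None:
            nxt = curr.next     # remember the rest of the list
            curr.next = prev    # flip this node's pointer backwards
            prev = curr         # advance: prev and curr both move one step
            curr = nxt
        lst.front = prev        # the old last node is the new front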

Saturday, March 1, 2014

Week 7 on recursion

As the saying goes, practice makes perfect, and I feel like I am currently exercising this idiom on recursion. Though perfect is well ahead of me, I can see a great leap I took in understanding and applying recursion. I still remember the day when the prof introduced recursion with nested lists. I was so frustrated, confused and lost that recursion became what my friend and I were discussing over a Friday night dinner.

Till now, each week's material has been closely related to recursion. I think I covered it partially in each week's post, just for your information. Since this week we had a term test, I think everybody did a systematic review of the past 7 weeks' material. Undoubtedly, my understanding of recursion was enhanced through this practice, particularly by looking back at all the slides and labs.

In my way of solving a recursive problem, it is necessary to start from the simplest case. For example, given a problem with nested lists, we can start with a non-nested list (just a simple list) to see what the question is asking for. A little sketching or drawing never hurts.
Speaking of the essence of recursion, the next step is what the professor calls the recursive insight, which is how we build up the recursion on the previous case. Here, there are generally two ways of writing the code. Most of the examples given in class are written in the form of a list comprehension. I found this a little difficult to understand at first, but after a long time of training, sometimes it is the only solution I can think of. List comprehensions require one to be familiar with the many functions that work on lists, namely sum, any, all... Nevertheless, they have their constraints: when the problem becomes more complicated, all the scenarios can't be covered in one or two lines. This is when we should adopt the block method. I remember I was stuck on the third question in lab 4 for so long; it made me realize the importance of flexibility in choosing which method to use when writing code. Notably, in block form, it is important to have a return statement when you call the function itself, or whenever you want the result at this step to be used in the next. I made a lot of mistakes here, so this is just to remind everyone.
Finally, I finish up the recursion with the base case. I know I am kind of doing the second and third steps backwards, and occasionally I even forget about the base case. Interestingly, sometimes the base case is latently wired into the code and doesn't require significant effort to write. E.g., when we want to sum up all the items in a nested list, the base case is automatically handled by sum over a flat list.
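
For instance, a minimal sketch (my own code):

    def nested_sum(l):
        # non-list items are the implicit base case: they contribute themselves
        return sum(nested_sum(i) if isinstance(i, list) else i for i in l)

    # e.g. nested_sum([1, [2, [3]], 4]) -> 10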

Below is an interesting (kind of tricky) example I found in the tree structure code:
counting the leaves in a tree: return sum(count(i) for i in t.children) + (0 if t.children else 1)
The simplest case is where the height of the tree is 0. Then t.children is falsy, so we return 1.
Whereas the recursive insight is more mathematical: count the leaves of each of its children and add them up.
The trickiest part, I find, is how the data is accumulated. When I was thinking about this question myself, I didn't think of the second part of the return statement. This elaborate design accumulates nothing extra if a tree has any children, but when it reaches a leaf it returns 1. The sums then carry those 1s up the tree, thereby completing the code.
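
To convince myself, here is a runnable version (with my own minimal Tree class, which may differ from the course's):

    class Tree:
        def __init__(self, value, children=None):
            self.value = value
            self.children = children if children else []

    def count(t):
        # leaves contribute 1; internal nodes contribute only their children's sums
        return sum(count(i) for i in t.children) + (0 if t.children else 1)

    t = Tree(1, [Tree(2), Tree(3, [Tree(4), Tree(5)])])
    print(count(t))   # 3 leaves: 2, 4, 5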

Recursion is indeed an artistic design and a product of wisdom.

Friday, February 21, 2014

week 5

Throwback to week 5. All I remember was how hard the lab was...

The function Danny led us through in class provided me with enough inspiration to finish A1. It's a great idea which I could not have thought of myself.

I find the lingo used to describe the relations among values, variables and memory addresses in 148 different from that in 108. But since computer science is not about the accuracy of natural language, I would probably let it go. Still, my understanding is that a variable contains a value, which refers to a memory address, as traced in the visualizer.

About scope in Python, the key point given in class is to follow the hierarchy of scopes and apply the lookup sequence. For example, in the example given in class, the most important part to understand is how do_global() failed to make print("After global assignment:", spam) print the global spam, and instead it printed the nonlocal spam. Since the nonlocal assignment had already rebound spam in the enclosing scope, that binding still shadows the global one there.
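
For reference, the example from class looked, as far as I recall, like the classic one from the Python tutorial:

    def scope_test():
        def do_local():
            spam = "local spam"          # binds a name only inside do_local
        def do_nonlocal():
            nonlocal spam                # rebinds scope_test's spam
            spam = "nonlocal spam"
        def do_global():
            global spam                  # binds/rebinds the module-level spam
            spam = "global spam"
        spam = "test spam"
        do_local()
        print("After local assignment:", spam)      # test spam
        do_nonlocal()
        print("After nonlocal assignment:", spam)   # nonlocal spam
        do_global()
        print("After global assignment:", spam)     # still nonlocal spam!

    scope_test()
    print("In global scope:", spam)                 # global spam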

But looking back, as I am reviewing everything and doing the labs all over again, recursion begins to make sense to me. Practice makes perfect. Here are some typical recursion questions, one of which I copied from the slides, and one I thought of myself:
largest number in a nested list: the insight is to replace every nested list with its max, and the base case is where there's no nested list: max(l). So the code is return max([largest(i) if isinstance(i, list) else i for i in l]), which creates a new simple, non-nested list.
The second example is to decide whether a number is in a nested list. The base case is where there's no nested list: simply n in l. The insight is that n is either among the non-nested items or inside a nested list. Code: return any(n == i if not isinstance(i, list) else whether(n, i) for i in l)
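
Put together as a runnable sketch (largest and whether are my own names for these functions):

    def largest(l):
        return max([largest(i) if isinstance(i, list) else i for i in l])

    def whether(n, l):
        return any(n == i if not isinstance(i, list) else whether(n, i) for i in l)

    print(largest([1, [7, [3]], 5]))     # 7
    print(whether(3, [1, [7, [3]], 5]))  # True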

Thursday, February 20, 2014

week 6

This week we began with binary trees. The topmost node is called the root, and especially for binary trees, each node can have only 2 children. In the list representation, the recursion is obvious in that tl[0] is an object/value while the remaining two components are tree lists themselves. As for the three ways to traverse the tree, I find the first one is what we generally use, while I really don't know what the other two can be used for. A binary tree presents a systematic hierarchy of data structure.
In addition, I came up with a solution different from the one in class: e.g. for inorder, return (inorder(tl[1]) if isinstance(tl[1], list) else []) + [tl[0]] + (inorder(tl[2]) if isinstance(tl[2], list) else []). The ternary structure requires all three components. This solution was just a thought; I later found out it would not work if the list is None.
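
A minimal runnable sketch of what I was going for (my own code, treating a tree list as [value, left, right] with None for missing children):

    def inorder(tl):
        if tl is None:        # guard the case my one-liner missed
            return []
        return inorder(tl[1]) + [tl[0]] + inorder(tl[2])

    # e.g. inorder([2, [1, None, None], [3, None, None]]) -> [1, 2, 3]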

The second part is the general tree implementation, where trees can have more than two children. It follows the structure of the binary tree except that it collects the branches of each node together in a list (from left to right). Clearly, the leaves are also represented as subtrees, with children set to an empty list in the __init__ method. I spent a lot of time trying to figure out how to write its methods with recursion, and found that though it's relatively easy to trace and understand recursion, it's so much harder to write it down on my own. By looking at the answers, I gathered some inspiration from them; a sketch of the whole class follows these notes.
__contains__: I was initially kind of confused that it doesn't have a base case. But when self.children is [], the for loop doesn't execute, thus ending the recursion. Notably, the use of any is helpful in some cases.
__repr__: I was trying this in the shell, and it turned out that I don't need to use recursion at all. Since I pass in a tree list, the __init__ method turns it into the appropriate structure, and when we try to "represent" it, can't we just simply use 'Tree({}, {})'.format(self.value, self.children)?
leaf_count: this function also used something new that took me a while to understand. The idea I first came up with was sum([leaf_count(i) for i in self.children]), and then I got stuck on the base case.
count: the base case is when self.children == [], return 1. Note that e.g. sum([2**x for x in []]) returns 0.
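
Here is the sketch I promised above (my own reconstruction, not the official course code):

    class Tree:
        def __init__(self, value, children=None):
            self.value = value
            self.children = children if children else []   # leaves get []

        def __contains__(self, v):
            # no explicit base case: on a leaf, any(...) over [] is just False
            return self.value == v or any(v in c for c in self.children)

        def leaf_count(self):
            if not self.children:       # base case: a leaf counts itself
                return 1
            return sum(c.leaf_count() for c in self.children)

        def count(self):
            # sum over [] is 0, so a leaf returns 1 automatically
            return 1 + sum(c.count() for c in self.children)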

Meanwhile, I am trying to come up with an ultimate process for writing recursive functions. The first step is coming up with the base case, and then writing code the recursion can run on. But loopholes are everywhere.

Questions
1. An if statement can exist without an else; in the short-form expression inside a list comprehension, can we use "x if condition" and leave out the else? When I tried this, it resulted in an error.
answer: [x+7 if x > 2 else x+2 for x in [2, 3, 4]] works; the ternary expression has a fixed syntax that requires the else.

2. When we define our own class, the isinstance function should also work on it. But for the Tree class here, even though we require children to be None or other Trees, isinstance can't tell the difference; i.e., isinstance(Tree(2, [2, 3]), Tree) still evaluates to True. How can we fix this problem?