Monday, February 24, 2014

Interview Question

October 24th, 2009 by lewis

SEE ALSO: Google PM interview classGoogle Software Engineer interview classGoogle Product Marketing interview class

Here’s a list of 140 Google interview questions. Many of our clients have interviewed and received Google job offers. Contact us for a free 15 minute interview analysis before your Google interview.
Link to Google interview questions for:
Google Interview Questions: Product Marketing Manager
  • Why do you want to join Google?
  • What do you know about Google’s product and technology?
  • If you are Product Manager for Google’s Adwords, how do you plan to market this?
  • What would you say during an AdWords or AdSense product seminar?
  • Who are Google’s competitors, and how does Google compete with them?
  • Have you ever used Google’s products? Gmail?
  • What’s a creative way of marketing Google’s brand name and product?
  • If you are the product marketing manager for Google’s Gmail product, how do you plan to market it so as to achieve 100 million customers in 6 months?
  • How much money you think Google makes daily from Gmail ads?
  • Name a piece of technology you’ve read about recently. Now tell me your own creative execution for an ad for that product.
  • Say an advertiser makes $0.10 every time someone clicks on their ad. Only 20% of people who visit the site click on their ad. How many people need to visit the site for the advertiser to make $20?
  • Estimate the number of students who are college seniors, attend four-year schools, and graduate with a job in the United States every year.
Google Interview Questions: Product Manager
  • How would you boost the GMail subscription base?
  • What is the most efficient way to sort a million integers?
  • How would you re-position Google’s offerings to counteract competitive threats from Microsoft?
  • How many golf balls can fit in a school bus?
  • You are shrunk to the height of a nickel and your mass is proportionally reduced so as to maintain your original density. You are then thrown into an empty glass blender. The blades will start moving in 60 seconds. What do you do?
  • How much should you charge to wash all the windows in Seattle?
  • How would you find out if a machine’s stack grows up or down in memory?
  • Explain a database in three sentences to your eight-year-old nephew.
  • How many times a day does a clock’s hands overlap?
  • You have to get from point A to point B. You don’t know if you can get there. What would you do?
  • Imagine you have a closet full of shirts. It’s very hard to find a shirt. So what can you do to organize your shirts for easy retrieval?
  • Every man in a village of 100 married couples has cheated on his wife. Every wife in the village instantly knows when a man other than her husband has cheated, but does not know when her own husband has. The village has a law that does not allow for adultery. Any wife who can prove that her husband is unfaithful must kill him that very day. The women of the village would never disobey this law. One day, the queen of the village visits and announces that at least one husband has been unfaithful. What happens?
  • In a country in which people only want boys, every family continues to have children until they have a boy. If they have a girl, they have another child. If they have a boy, they stop. What is the proportion of boys to girls in the country?
  • If the probability of observing a car in 30 minutes on a highway is 0.95, what is the probability of observing a car in 10 minutes (assuming constant default probability)?
  • If you look at a clock and the time is 3:15, what is the angle between the hour and the minute hands? (The answer to this is not zero!)
  • Four people need to cross a rickety rope bridge to get back to their camp at night. Unfortunately, they only have one flashlight and it only has enough light left for seventeen minutes. The bridge is too dangerous to cross without a flashlight, and it’s only strong enough to support two people at any given time. Each of the campers walks at a different speed. One can cross the bridge in 1 minute, another in 2 minutes, the third in 5 minutes, and the slow poke takes 10 minutes to cross. How do the campers make it across in 17 minutes?
  • You are at a party with a friend and 10 people are present including you and the friend. your friend makes you a wager that for every person you find that has the same birthday as you, you get $1; for every person he finds that does not have the same birthday as you, he gets $2. would you accept the wager?
  • How many piano tuners are there in the entire world?
  • You have eight balls all of the same size. 7 of them weigh the same, and one of them weighs slightly more. How can you find the ball that is heavier by using a balance and only two weighings?
  • You have five pirates, ranked from 5 to 1 in descending order. The top pirate has the right to propose how 100 gold coins should be divided among them. But the others get to vote on his plan, and if fewer than half agree with him, he gets killed. How should he allocate the gold in order to maximize his share but live to enjoy it? (Hint: One pirate ends up with 98 percent of the gold.)
  • You are given 2 eggs. You have access to a 100-story building. Eggs can be very hard or very fragile means it may break if dropped from the first floor or may not even break if dropped from 100th floor. Both eggs are identical. You need to figure out the highest floor of a 100-story building an egg can be dropped without breaking. The question is how many drops you need to make. You are allowed to break 2 eggs in the process.
  • Describe a technical problem you had and how you solved it.
  • How would you design a simple search engine?
  • Design an evacuation plan for San Francisco.
  • There’s a latency problem in South Africa. Diagnose it.
  • What are three long term challenges facing Google?
  • Name three non-Google websites that you visit often and like. What do you like about the user interface and design? Choose one of the three sites and comment on what new feature or project you would work on. How would you design it?
  • If there is only one elevator in the building, how would you change the design? How about if there are only two elevators in the building?
  • How many vacuum’s are made per year in USA?
Google Interview Questions: Software Engineer
  • Why are manhole covers round?
  • What is the difference between a mutex and a semaphore? Which one would you use to protect access to an increment operation?
  • A man pushed his car to a hotel and lost his fortune. What happened?
  • Explain the significance of “dead beef”.
  • Write a C program which measures the the speed of a context switch on a UNIX/Linux system.
  • Given a function which produces a random integer in the range 1 to 5, write a function which produces a random integer in the range 1 to 7.
  • Describe the algorithm for a depth-first graph traversal.
  • Design a class library for writing card games.
  • You need to check that your friend, Bob, has your correct phone number, but you cannot ask him directly. You must write a the question on a card which and give it to Eve who will take the card to Bob and return the answer to you. What must you write on the card, besides the question, to ensure Bob can encode the message so that Eve cannot read your phone number?
  • How are cookies passed in the HTTP protocol?
  • Design the SQL database tables for a car rental database.
  • Write a regular expression which matches a email address.
  • Write a function f(a, b) which takes two character string arguments and returns a string containing only the characters found in both strings in the order of a. Write a version which is order N-squared and one which is order N.
  • You are given a the source to a application which is crashing when run. After running it 10 times in a debugger, you find it never crashes in the same place. The application is single threaded, and uses only the C standard library. What programming errors could be causing this crash? How would you test each one?
  • Explain how congestion control works in the TCP protocol.
  • In Java, what is the difference between final, finally, and finalize?
  • What is multithreaded programming? What is a deadlock?
  • Write a function (with helper functions if needed) called to Excel that takes an excel column value (A,B,C,D…AA,AB,AC,… AAA..) and returns a corresponding integer value (A=1,B=2,… AA=26..).
  • You have a stream of infinite queries (ie: real time Google search queries that people are entering). Describe how you would go about finding a good estimate of 1000 samples from this never ending set of data and then write code for it.
  • Tree search algorithms. Write BFS and DFS code, explain run time and space requirements. Modify the code to handle trees with weighted edges and loops with BFS and DFS, make the code print out path to goal state.
  • You are given a list of numbers. When you reach the end of the list you will come back to the beginning of the list (a circular list). Write the most efficient algorithm to find the minimum # in this list. Find any given # in the list. The numbers in the list are always increasing but you don’t know where the circular list begins, ie: 38, 40, 55, 89, 6, 13, 20, 23, 36.
  • Describe the data structure that is used to manage memory. (stack)
  • What’s the difference between local and global variables?
  • If you have 1 million integers, how would you sort them efficiently? (modify a specific sorting algorithm to solve this)
  • In Java, what is the difference between static, final, and const. (if you don’t know Java they will ask something similar for C or C++).
  • Talk about your class projects or work projects (pick something easy)… then describe how you could make them more efficient (in terms of algorithms).
  • Suppose you have an NxN matrix of positive and negative integers. Write some code that finds the sub-matrix with the maximum sum of its elements.
  • Write some code to reverse a string.
  • Implement division (without using the divide operator, obviously).
  • Write some code to find all permutations of the letters in a particular string.
  • What method would you use to look up a word in a dictionary?
  • Imagine you have a closet full of shirts. It’s very hard to find a shirt. So what can you do to organize your shirts for easy retrieval?
  • You have eight balls all of the same size. 7 of them weigh the same, and one of them weighs slightly more. How can you fine the ball that is heavier by using a balance and only two weighings?
  • What is the C-language command for opening a connection with a foreign host over the internet?
  • Design and describe a system/application that will most efficiently produce a report of the top 1 million Google search requests. These are the particulars: 1) You are given 12 servers to work with. They are all dual-processor machines with 4Gb of RAM, 4x400GB hard drives and networked together.(Basically, nothing more than high-end PC’s) 2) The log data has already been cleaned for you. It consists of 100 Billion log lines, broken down into 12 320 GB files of 40-byte search terms per line. 3) You can use only custom written applications or available free open-source software.
  • There is an array A[N] of N numbers. You have to compose an array Output[N] such that Output[i] will be equal to multiplication of all the elements of A[N] except A[i]. For example Output[0] will be multiplication of A[1] to A[N-1] and Output[1] will be multiplication of A[0] and from A[2] to A[N-1]. Solve it without division operator and in O(n).
  • There is a linked list of numbers of length N. N is very large and you don’t know N. You have to write a function that will return k random numbers from the list. Numbers should be completely random. Hint: 1. Use random function rand() (returns a number between 0 and 1) and irand() (return either 0 or 1) 2. It should be done in O(n).
  • Find or determine non existence of a number in a sorted list of N numbers where the numbers range over M, M>> N and N large enough to span multiple disks. Algorithm to beat O(log n) bonus points for constant time algorithm.
  • You are given a game of Tic Tac Toe. You have to write a function in which you pass the whole game and name of a player. The function will return whether the player has won the game or not. First you to decide which data structure you will use for the game. You need to tell the algorithm first and then need to write the code. Note: Some position may be blank in the game। So your data structure should consider this condition also.
  • You are given an array [a1 To an] and we have to construct another array [b1 To bn] where bi = a1*a2*…*an/ai. you are allowed to use only constant space and the time complexity is O(n). No divisions are allowed.
  • How do you put a Binary Search Tree in an array in a efficient manner. Hint :: If the node is stored at the ith position and its children are at 2i and 2i+1(I mean level order wise)Its not the most efficient way.
  • How do you find out the fifth maximum element in an Binary Search Tree in efficient manner. Note: You should not use use any extra space. i.e sorting Binary Search Tree and storing the results in an array and listing out the fifth element.
  • Given a Data Structure having first n integers and next n chars. A = i1 i2 i3 … iN c1 c2 c3 … cN.Write an in-place algorithm to rearrange the elements of the array ass A = i1 c1 i2 c2 … in cn
  • Given two sequences of items, find the items whose absolute number increases or decreases the most when comparing one sequence with the other by reading the sequence only once.
  • Given That One of the strings is very very long , and the other one could be of various sizes. Windowing will result in O(N+M) solution but could it be better? May be NlogM or even better?
  • How many lines can be drawn in a 2D plane such that they are equidistant from 3 non-collinear points?
  • Let’s say you have to construct Google maps from scratch and guide a person standing on Gateway of India (Mumbai) to India Gate(Delhi). How do you do the same?
  • Given that you have one string of length N and M small strings of length L. How do you efficiently find the occurrence of each small string in the larger one?
  • Given a binary tree, programmatically you need to prove it is a binary search tree.
  • You are given a small sorted list of numbers, and a very very long sorted list of numbers – so long that it had to be put on a disk in different blocks. How would you find those short list numbers in the bigger one?
  • Suppose you have given N companies, and we want to eventually merge them into one big company. How many ways are theres to merge?
  • Given a file of 4 billion 32-bit integers, how to find one that appears at least twice?
  • Write a program for displaying the ten most frequent words in a file such that your program should be efficient in all complexity measures.
  • Design a stack. We want to push, pop, and also, retrieve the minimum element in constant time.
  • Given a set of coin denominators, find the minimum number of coins to give a certain amount of change.
  • Given an array, i) find the longest continuous increasing subsequence. ii) find the longest increasing subsequence.
  • Suppose we have N companies, and we want to eventually merge them into one big company. How many ways are there to merge?
  • Write a function to find the middle node of a single link list.
  • Given two binary trees, write a compare function to check if they are equal or not. Being equal means that they have the same value and same structure.
  • Implement put/get methods of a fixed size cache with LRU replacement algorithm.
  • You are given with three sorted arrays ( in ascending order), you are required to find a triplet ( one element from each array) such that distance is minimum.
  • Distance is defined like this : If a[i], b[j] and c[k] are three elements then distance=max(abs(a[i]-b[j]),abs(a[i]-c[k]),abs(b[j]-c[k]))” Please give a solution in O(n) time complexity
  • How does C++ deal with constructors and deconstructors of a class and its child class?
  • Write a function that flips the bits inside a byte (either in C++ or Java). Write an algorithm that take a list of n words, and an integer m, and retrieves the mth most frequent word in that list.
  • What’s 2 to the power of 64?
  • Given that you have one string of length N and M small strings of length L. How do you efficiently find the occurrence of each small string in the larger one?
  • How do you find out the fifth maximum element in an Binary Search Tree in efficient manner.
  • Suppose we have N companies, and we want to eventually merge them into one big company. How many ways are there to merge?
  • There is linked list of millions of node and you do not know the length of it. Write a function which will return a random number from the list.
  • You need to check that your friend, Bob, has your correct phone number, but you cannot ask him directly. You must write a the question on a card which and give it to Eve who will take the card to Bob and return the answer to you. What must you write on the card, besides the question, to ensure Bob can encode the message so that Eve cannot read your phone number?
  • How long it would take to sort 1 trillion numbers? Come up with a good estimate.
  • Order the functions in order of their asymptotic performance: 1) 2^n 2) n^100 3) n! 4) n^n
  • There are some data represented by(x,y,z). Now we want to find the Kth least data. We say (x1, y1, z1) > (x2, y2, z2) when value(x1, y1, z1) > value(x2, y2, z2) where value(x,y,z) = (2^x)*(3^y)*(5^z). Now we can not get it by calculating value(x,y,z) or through other indirect calculations as lg(value(x,y,z)). How to solve it?
  • How many degrees are there in the angle between the hour and minute hands of a clock when the time is a quarter past three?
  • Given an array whose elements are sorted, return the index of a the first occurrence of a specific integer. Do this in sub-linear time. I.e. do not just go through each element searching for that element.
  • Given two linked lists, return the intersection of the two lists: i.e. return a list containing only the elements that occur in both of the input lists.
  • What’s the difference between a hashtable and a hashmap?
  • If a person dials a sequence of numbers on the telephone, what possible words/strings can be formed from the letters associated with those numbers?
  • How would you reverse the image on an n by n matrix where each pixel is represented by a bit?
  • Create a fast cached storage mechanism that, given a limitation on the amount of cache memory, will ensure that only the least recently used items are discarded when the cache memory is reached when inserting a new item. It supports 2 functions: String get(T t) and void put(String k, T t).
  • Create a cost model that allows Google to make purchasing decisions on to compare the cost of purchasing more RAM memory for their servers vs. buying more disk space.
  • Design an algorithm to play a game of Frogger and then code the solution. The object of the game is to direct a frog to avoid cars while crossing a busy road. You may represent a road lane via an array. Generalize the solution for an N-lane road.
  • What sort would you use if you had a large data set on disk and a small amount of ram to work with?
  • What sort would you use if you required tight max time bounds and wanted highly regular performance.
  • How would you store 1 million phone numbers?
  • Design a 2D dungeon crawling game. It must allow for various items in the maze – walls, objects, and computer-controlled characters. (The focus was on the class structures, and how to optimize the experience for the user as s/he travels through the dungeon.)
  • What is the size of the C structure below on a 32-bit system? On a 64-bit?
struct foo {
char a;
char* b;
};

Google Interview: Software Engineer in Test
  • Efficiently implement 3 stacks in a single array.
  • Given an array of integers which is circularly sorted, how do you find a given integer.
  • Write a program to find depth of binary search tree without using recursion.
  • Find the maximum rectangle (in terms of area) under a histogram in linear time.
  • Most phones now have full keyboards. Before there there three letters mapped to a number button. Describe how you would go about implementing spelling and word suggestions as people type.
  • Describe recursive mergesort and its runtime. Write an iterative version in C++/Java/Python.
  • How would you determine if someone has won a game of tic-tac-toe on a board of any size?
  • Given an array of numbers, replace each number with the product of all the numbers in the array except the number itself *without* using division.
  • Create a cache with fast look up that only stores the N most recently accessed items.
  • How to design a search engine? If each document contains a set of keywords, and is associated with a numeric attribute, how to build indices?
  • Given two files that has list of words (one per line), write a program to show the intersection.
  • What kind of data structure would you use to index annagrams of words? e.g. if there exists the word “top” in the database, the query for “pot” should list that.
Google Interview: Quantitative Compensation Analyst
  • What is the yearly standard deviation of a stock given the monthly standard deviation?
  • How many resumes does Google receive each year for software engineering?
  • Anywhere in the world, where would you open up a new Google office and how would you figure out compensation for all the employees at this new office?
  • What is the probability of breaking a stick into 3 pieces and forming a triangle?
Google Interview: Engineering Manager
  • You’re the captain of a pirate ship, and your crew gets to vote on how the gold is divided up. If fewer than half of the pirates agree with you, you die. How do you recommend apportioning the gold in such a way that you get a good share of the booty, but still survive?
Google Interview: AdWords Associate
  • How would you work with an advertiser who was not seeing the benefits of the AdWords relationship due to poor conversions?
  • How would you deal with an angry or frustrated advertisers on the phone?
Sources
http://news.ycombinator.com/item?id=266663
http://tihomir.org/crazy-questions-at-google-job-interview/
http://www.drizzle.com/~jpaint/google.html
http://www.gamedev.net/community/forums/topic.asp?topic_id=299692
http://careers.cse.sc.edu/googleinterview
http://job-interview.blogspot.com/2005/02/google-interview-product-marketing.html
http://www.theregister.co.uk/2007/01/05/google_interview_tales/
http://money.cnn.com/2007/08/29/technology/brain_teasers.biz2/index.htm
http://blogs.lessthandot.com/index.php/ITProfessionals/EthicsIT/google-interview-questions
http://placementsindia.blogspot.com/2007/09/google-top-interview-puzzles.html
http://linkmingle.com/user/interview_questions/google_interview_questions
http://discuss.joelonsoftware.com/default.asp?interview.11.626758.33
http://mindcipher.com/puzzle/78-clock-works
http://www.glassdoor.com
http://bluepixel.ca/blog/?p=69
http://www.businessinsider.com/my-nightmare-interviews-with-google-2009-11

Friday, February 14, 2014

overflow recgonition 判断溢出

C++ 计算时判断溢出

方法 一
===========================================================
short, int, long 类型
a). 相加的时候,例如 a + b, 判断 INT_MAX - a 和 b的大小
b). 相减的时候,例如 a - b, 可以转换为a + (-b),  然后类似 上一条来判断。

要注意的是,b变为-b的时候,有可能发生溢出,需要单独判断。
如果b不是INT_MIN,可以转换后是直接按a) 来做
如果b是INT_MIN,要把 -b分拆为 -(b+1) 和1,因为是特例,我们可以直接改写为 a + -(b+1) + 1, 因为后面两个直接溢出,可以调整顺序,变为a + 1 + -(b+1)

以上这个办法,有缺点
第一、处理起来啰嗦
第二,INT_MAX-a本身就有可能溢出

方法 二
===========================================================
转换为上下限更高的数据类型,然后来判断
比如 int 类型的 a+b,我们可以转换为 (long)a + (long)b, 然后和INT_MAX 或者 INT_MIN比较
如果是long,可以转化为long long
如果是long long, 可以转化为double
double以后,就有内置函数类比较了 
http://www.cplusplus.com/reference/cmath/isfinite/
http://www.cplusplus.com/reference/cmath/isinf/

方法 二
===========================================================
对于 short, int, long, long long 类型的数据a 和 b 的加减法,判断溢出通用的做法可以是
先判断是否同号,
如果相加异号 或者 相减同号,无溢出
如果相加同号 或者 相减异号,换为无符号数,unsigned的,先运算c = a + b
然后通过比较a, b 和c的符号来判断是否溢出。



而对于float, double, long double 这些,会有内置的函数isfinite,isinf来比较, 参见
http://www.cplusplus.com/reference/cmath/isfinite/
http://www.cplusplus.com/reference/cmath/isinf/

/* isinf example */
#include <stdio.h>      /* printf */
#include <math.h>       /* isinf, sqrt */

int main()
{
  printf ("isinf(0.0)       : %d\n",isinf(0.0));
  printf ("isinf(1.0/0.0)   : %d\n",isinf(1.0/0.0));
  printf ("isinf(-1.0/0.0)  : %d\n",isinf(-1.0/0.0));
  printf ("isinf(sqrt(-1.0)): %d\n",isinf(sqrt(-1.0)));
  return 0;
}

Thursday, February 13, 2014

k sum problem (k 个数的求和问题)

问题陈述:

在一个数组,从中找出k个数(每个数不能重复取。数组中同一个值有多个,可以取多个),使得和为零。找出所有这样的组合,要求没有重复项(只要值不同即可,不要求在原数组中的index不同)

解法:

2 sum 用hash table做,可以时间O(n),空间O(n),
2 sum 如果用sort以后,在前后扫描,可以时间O(nlogn + n) = O(nlogn),空间O(1)
2 sum 用hash table做的好处是快,但是等于是利用了不用排序的特点。排序的办法,在高维度(也就是k sum问题,k>2)的时候,nlogn就不是主要的时间消耗成分,也就更适合2sum的sort后双指针扫描查找的办法。

那么,对于k sum, k>2的,如果用sort的话, 可以 对 n-2的数做嵌套循环,因为已经sort过了,最后剩下的两维用2 sum的第二个办法, 时间是O(nlogn + n^(k-2) * n) = O(n^(n-1)),空间O(1)。 但是这样跟纯嵌套循环没有什么区别,只是最后一层少了一个因子n。有什么办法能优化?
就是说,对于 k sum (k>2) 问题 (一个size为n的array, 查找k个数的一个tuple,满足总和sum为0), 有没有时间复杂度在O(n^(k-2))的办法?

之前常规的一层一层剥离,n的次数是递增的。只有在最后一层,还有两个维度的时候,时间开销上减少一个n的因子,但是这样时间开销还是太多

我们可以通过对问题分解来解决
举个例子
...-5,-4,-3,-2,-1, 0,1, 2, 3, 4, 5.... 要找 4 sum = 0
那么先分解
4 分成 2 sum + 2 sum 来解决,但是这里的子问题2 sum没有sum=0的要求,是保留任何中间值。只有当子问题的2 sum解决以后,回归原问题的时候,我们才又回归原始的2 sum问题,这时候sum=0
子问题,空间和时间消耗,都是O(n^2)
回归大问题,时间消耗,是O(n^2)

假设k sum中  k = 2^m, 那么一共有m层,会有m次分解
分解到最底层,时间空间消耗 从 原始O(n)变为新的O(n^2)
分解到次底层,时间空间消耗 从 O(n^2)变为新的O((n^2)^2)
...
到达最顶层,时间空间消耗就都变成了O(n^(2*m)) = O(n^(2logk))

和之前的方法O(n^(k-1))相比,O(n^(2logk))的时间是少了很多,但是空间消耗却很大。
因为子问题无法确定把哪一个中间结果留下,那么就需要把子问题的结果全部返回,到最后,空间消耗就很大了。整体效果算是空间换时间吧。

通过 问题的分解 + hashtable的运用,能明显减少时间消耗, 但是空间消耗变大是个问题。比如说,如果有10^6的int类型数组,我如果用这个hashtable的办法,就要有10^12的pair,这就有10T以上的空间消耗。

问题的分解是个很好的思路,但是中间值得保留迫使空间消耗增大,这和用不用hashtable倒没有很大关系,只是说,如果不用hashtable,时间消耗会更大。





另外,还有一些题目的变形,比如如果要求所有组合,满足
sum < k,
sum = k,
sum > k,
或者是 closest to k
遇到这些变形的时候,hashtable的做法就显得乏力了,但是嵌套循环的方式却仍是可行的。尤其是对closest to k这种非确定性的要求。

Friday, February 7, 2014

How to write resume

Link: http://www.mitbbs.com/article/JobHunting/32620869_0.html
Author: skydoor

发信人: skydoor (海阔天空), 信区: JobHunting
标  题: Amazon内推不少简历后的分享
发信站: BBS 未名空间站 (Thu Feb  6 18:49:32 2014, 美东)


最近收到不少朋友的简历,我都一一帮忙内推Amazon了,看了不少简历之后,写点东西
分享一下,不少朋友我在submit之前,都给了修改意见,我不负责招人,我说的不一定
对,大家可以探讨。 

我的帖子不代表Amazon观点和要求,只是我个人看法。

基本思想是,你的简历反映了你是否用心,人家很容易看清楚你是否用心,如何从一大
堆简历里面脱颖而出,第一,不能有硬伤;第二,要言之有物;第三,不要confuse别人

首先,你的简历的名字不要太general。不少朋友命名,resume.pdf,resume.doc,我
觉得,你可以在简历的名字上传递更多地信息,比如,resume_John_Smith.pdf, 
resume_John_Smith_Amazon.pdf, 是不是更好一些? 这样人家一看就觉得是有备而来
的,也方便人家在一大堆简历里面迅速找到你,不然人家还要点开看是谁?

第二,表述要一致。特别是时间,有的朋友简历里面,  Sep 2013, Fall 2013, 
January 2013, 08/2013 或者直接2012-2013没有月份。我的point是不管哪个方式,你
一定要一致。要么全部简写,要么全部整个单词,要么都是数字,总之要统一。

第三,不是好东西不要写。比如GPA, 如果你GPA 3.5以下, 其实就不用写了,可以写
类似5/100之类的;比如skills,你如果不太懂,千万不要写什么 basic knowldge of 
XXXX。总之就是,没人指望你很牛,但是,你不牛的部分,就不需要强调了。

第四,如果你申请马工,一定要强调skill的关键词。 个人建议,你描述的project的
时候,尽量多写类似,“...in Java” “...in Java” “...in Python” “...in C
++” 之类。 

第五,如果你很多paper的,申请马工,没有必要让publication占据很多位置,你可以
不把作者写全,可以不写具体杂志哪些页,也可以只写“selected publictions”。总
之就是不要喧宾夺主。

第六,不要confuse别人,你最好把你的简历拿去一个非同学非同一个系的朋友看看,
看是否有哪些地方对方咋一看很不知所以的地方。

第七,没必要限定在一页里面,一页半也是OK的


细节决定成败,你从一大堆人里面脱颖而出,绝对是大量的正确的细节决定的结果。

Monday, February 3, 2014

Reservoir Sampling Proof


array R[k];    // result
integer i, j;

// fill the reservoir array
for each i in 1 to k do
    R[i] := S[i]
done;

// replace elements with gradually decreasing probability
for each i in k+1 to length(S) do
    j := random(1, i);   // important: inclusive range
    if j <= k then
        R[j] := S[i]
    fi
done

 一开始有k个
每个数字被选进来的概率都是1

第k+1个,要放进去,那么前面每个被换出去的概率是 1/(k+1), 没被换出去的概率是k/(k+1)
有k-1个数字,是留下来了,留下来的概率是k/(k+1)
有1个数字,换出去了,新进来的,被选进来的概率是 k/(k+1)
所有数字留下来的概率都是k/(k+1)

然后,处理第k+2个
第k+2个,要放进去,那么前面每个被换出去的概率是 1/(k+2), 没被换出去的概率是(k+1)/(k+2),
有k-1个数字, 是留下来了, (k+1)/(k+2), 再加上上一次的概率,留下来的总概率是k/(k+1)  *   (k+1)/(k+2) = k/(k+2)
有1个数字,换出去了,那么新进来的数字被选进来的概率是 k/(k+2)
所有数字留下来的概率都是k/(k+2)


假设当前处理了n个样本,选出了k个数,并且满足条件
k个选出的样本,每个样本在n个样本中被抽出的概率都是k/n
那么处理第n+1个样本的时候
1). 第n+1个样本被选进k的概率是 k/(n+1),即,被换进的样本的概率是 k/(n+1)
2). 对于前k个样本,留下的概率是 n/(n+1),与之前的概率 k/n,做条件概率相乘,截止到目前该样本仍然留在k个样本中的总概率是 n/(n+1) * k/n = k/(n+1)
综合1) 2), k个样本每个留下的概率都是 k/(n+1)

以此类推,对于每个状态,即处理过的样本总量为n的时候,所有留下来的样本的全局总概率都是k/n 

Farthest Ascending Pair

Problem :
Given a sequence of integers A[n], find a pair of integers A[x]<A[y], such that y-x >= j-i and A[i]<A[j]

Solution:
In courtesy of JConstantine

 
class Solution
{
    public:
    std::vector<int> findLongestAscendingPair(const std::vector<int> &v) {
        std::vector<int> ret(2,0);
        int n = v.size();
        if(n == 0) return ret;
        std::vector<int> left(n,0);
        std::vector<int> right(n,0);
        left[0] = v[0];
        right[n-1] = v[n-1];
        for(int i=1; i<n; i++){
            left[i] = std::min(v[i], left[i-1]);
        }
        
        for(int i=n-2; i>=0; i--){
            right[i] = std::max(v[i], right[i+1]);
        }
        
        int i = 0, j = 0, distance = 0;
        while(i < n && j < n)
        {
            if(left[i] < right[j])
            {
                distance = std::max(j-i, distance);
                ret[0] = i;
                ret[1] = j;
                j = j + 1;
            }
            else
            {
                i = i+ 1;
            }
        }
        return ret;
    }
};

Sunday, February 2, 2014

longest increasing sequence


 
#include < vector >
using namespace std;

/* Finds longest strictly increasing subsequence. O(n log k) algorithm. */
void find_lis(vector < int > & a, vector < int > & b)
{
    vector < int > p(a.size());
    int u, v;
    
    if(a.empty()) return;
    
    b.push_back(0);
    
    for(size_t i = 1; i < a.size(); i++)
    {
        // If next element a[i] is greater than last element of current longest subsequence a[b.back()], just push it at back of "b" and continue
        if(a[b.back()] < a[i])
        {
            p[i] = b.back();
            b.push_back(i);
            continue;
        }
        
        // Binary search to find the smallest element referenced by b which is just bigger than a[i]
        // Note : Binary search is performed on b (and not a). Size of b is always <=k and hence contributes O(log k) to complexity.
        for(u = 0, v = b.size() - 1; u < v;)
        {
            int c = (u + v) / 2;
            if(a[b[c]] < a[i]) u = c + 1;
            else v = c;
        }
        
        // Update b if new value is smaller then previously referenced value
        if(a[i] < a[b[u]])
        {
            if(u > 0) p[i] = b[u - 1];
            b[u] = i;
        }
    }
    
    for(u = b.size(), v = b.back(); u--; v = p[v]) b[u] = v;
}

/* Example of usage: */
#include < cstdio >
int main()
{
    int a[] = {
        1, 9, 3, 8, 11, 4, 5, 6, 4, 19, 7, 1, 7
    };
    vector < int > seq(a, a + sizeof(a) / sizeof(a[0])); // seq : Input Vector
    vector < int > lis; // lis : Vector containing indexes of longest subsequence
    find_lis(seq, lis);
    
    //Printing actual output
    for(size_t i = 0; i < lis.size(); i++)
    printf("%d ", seq[lis[i]]);
    printf("\n");
    
    return 0;
}