Searching Techniques -------------------- Developed by Hardeep Singh. Copyright: Hardeep Singh, 2001. EMail : seeingwithc@hotmail.com Website : seeingwithc.org All rights reserved. The code may not be used commercially without permission. The code does not come with any warranties, explicit or implied. The code cannot be distributed without this header. Problem: Given a list of distinct numbers (LIST), and another number (A), find the index of A in LIST if it is present or indicate that it is not present. Although we have narrated the problem for numbers, it can actually refer to any datatype that can be compared for equality and arranged in order, such as strings. Note: In the text below, lg(n) means log of n to base 2. Solution #1: The naive approach ------------------ The simplest way, of course, is to compare the given number to each of the numbers in the list. If a number in the list matches the given number A, we can return the index of that number. If we reach the end of the list, we can indicate that A does not exist in LIST. Here is an implementation of this simple algorithm: #include #define MAXELT 50 void main() { int i=-1,n=0,list[MAXELT],a; char t[10]; int linear_search(int list[],int n,int a); do { //read the list if (i!=-1) list[i++]=n; else i++; printf("\nEnter the numbers "); fflush(stdin); scanf("%[^\n]",t); if (sscanf(t,"%d",&n)<1) break; } while (1); printf("\nEnter the number to be searched for >"); //read the number scanf("%d",&a); int pos=linear_search(list,i,a); //do linear search if (pos==-1) //print the result printf("\nThe number does not exist in the list."); else printf("\nThe number exists at location %d.",pos+1); } int linear_search(int list[],int n,int a) { for (int i=0;i=n f1=f1+f2; //next fibonacci number f2=f1-f2; //store old f1 } f1=f1-f2; //find lower fibonacci numbers f2=f2-f1; //f1=fib(j-2), f2=fib(j-3) mid=n-f1+1; while (a!=list[mid]) //if not found if (mid<0 || a>list[mid]) { //look in lower half if (f1==1) return -1; mid=mid+f2; //decrease fibonacci numbers f1=f1-f2; f2=f2-f1; } else { //look in upper half if (f2==0) //if not found return -1 return -1; mid=mid-f2; //decrease fibonacci numbers t=f1-f2; //this time, decrease more f1=f2; //for smaller list f2=t; } return mid; } Whew, this one has the longest code so far. It is also O(lgn) algorithm. To do a comparison, we used a list of 10 numbers, and searched for each of the 10 numbers in the list once. Then, we searched for three un-present(?) numbers in the list, a total of 13 searches. We counted the total number of key comparisons using fibonacci_search and binary_search. The total for binary_search was 40, for fibonacci_search it was 41. Since this was a small scale example, binary search scored, in larger instances, it may be the other way around. ADDENDUM -------- In this part, I will express the linear and the binary search algorithms in 80386 assembly language. This, is towards the following goals: 1) For people who don't know assembly/machine languages, I want them to understand assembly and the way the programs that they have been writing runs on the machine, after compilation. I want the to specially notice the fact that what look like two distinct key comparisons in the mail loop of binary search can be realised with just one CoMPare instruction at the assembly level, therefore it is always counted as one. 2) For those who know a little assembly but work mainly in C/C++, I want to provide them with more practice in the native computer language. It always helps to know assembly! 3) For those who already know assembly, I want to provide them with ready made timesaver routines. Even while writing C/C++ software, these assembly routines can be used to speed things up, since all good C/C++ compilers allow facilities to write inline assembly code. Linear Search ------------- Precondition: CX holds the number of entries, SI has been initialised to first array element, D=0. AGAIN: LODSD ; The instruction reads the next number from memory ; into a CPU register(EAX) CMP EAX,EBX ; Compare registers EAX and EBX. EBX already holds ; the value to be found LOOPNE AGAIN ; Loop if EAX and EBX Not Equal and all entries have ; not been exhausted - to statement labeled AGAIN CMP EAX,EBX ; Out of the loop - Two things are possible: EAX and ; EBX were found equal, or all list entries exhausted ; So, compare to make sure JNE LABEL1 ; Jump if Not Equal to LABEL1 ... ; Code to announce the fact that the entry has been ; found at location marked by CX and exit LABEL1:... ; Code to announce that the number has not been ; found and exit. Binary Search -------------- AGAIN: MOV AL,FIRST ; Register AL <- Memory location containing FIRST CMP AL,LAST ; CoMPare AL (FIRST) with LAST JA NOTFOUND ; Jump if Above - to NOTFOUND since FIRST>LAST ADD AL,LAST ; ADD LAST to AL. Al now contains FIRST+LAST SHR AL,1 ; SHift Right AL once, this will divide it by 2 MOV MID,AL ; Send current value of AL to memory location MID MOVZX AX,AL ; AX <- AL. (This is called MOV Zero eXpanded) SHL AX,2 ; SH Left AX by 2. This multiplies it by 4. This is done because each array value is of $ bytes. ADD AX,OFFSET ARRAY ; Now AX contains the address of MID element MOV SI,AX ; SI <- AX LODSD ; Read value from memory location SI into EAX CMP EAX,N ; CoMPare EAX with N, the number to be found JE FOUND ; Jump if Equal to FOUND JL LESS ; Jump if Less to LESS (to lower list). Note that ; no comparison is done here, just a jump. Just one ; comparison was done in CMP statement above. The flags ; hold the result which is checked by the jump ; instruction. Jump is equivalent to a simple goto ; in high level language; it does not do any inherent ; comparison MOV AL,MID ; AL <- MID DEC AL ; AL <- AL-1 MOV LAST,AL ; LAST <- AL. These three instructions accomplish ; LAST=MID-1. JMP AGAIN ; Iterate LESS: MOV AL,MID ; AL <- MID INC AL ; AL <- AL+1 MOV FIRST,AL ; FIRST <- AL. These three instructions accomplish ; FIRST=MID+1 JMP AGAIN ; JuMP FOUND: ... ; Code to indicate "Found number" at location contained by SI and exit NOTFOUND: ... ; Code to indicate "Not found" That's all folks!