Algorithms And It`s Types

if we talk about algorithm, In mathematics and computer science is a self-contained sequence of actions to be performed. Algorithms can perform calculation, data processing and automated reasoning tasks. The concept of algorithm has existed for centuries; however, a partial formalization of what would become the modern algorithm began with attempts to solve the Entscheidungsproblem (the “decision problem”) posed by David Hilbert in 1928.

In this article we discuss on various types of Algorithms.  Here is a short list of Algorithms we will discuss

  1. Simple recursive algorithms
  2. Backtracking algorithms
  3. Divide and conquer algorithms
  4. Dynamic programming algorithms
  5. Greedy algorithms
  6. Branch and bound algorithms
  7. Brute force algorithms
  8. Randomized algorithms

Simple recursive algorithms

This types of algorithm can solves the base cases directly. It Recurs with a simpler subproblem and does some extra work to convert the solution to the simpler subproblem into a solution to the given problem. In short A recursive algorithm is an algorithm which calls itself with “smaller (or simpler)” input values, and which obtains the result for the current input by applying simple operations to the returned value for the smaller (or simpler) input. We call these “simple” because several of the other algorithm types are inherently recursive

Some Examples of recursive algorithms are:

  • To count the number of elements in a list:
  1. If the list is empty, return zero; otherwise,
  2. Step past the first element, and count the remaining elements in the list
  3. Add one to the result
  • To test if a value occurs in a list:
  1. If the list is empty, return false; otherwise,
  2. If the first thing in the list is the given value, return true; otherwise
  3. Step past the first element, and test whether the value occurs in the remainder of the list

Backtracking algorithms

Backtracking algorithms are based on a depth-first recursive search. A backtracking algorithms used to Tests to see if a solution has been found,  and if so, return it otherwise false.  Now for each choice that can be made at this point, we perform following process

  • Make the choice
  • Recurse
  • if the recursion returns a solution, return it.
  • if no choices remain, return failure

Example of Backtracking

To color a map with no more than four colors:

  • color(Country n)
  • If all countries have been colored (n > number of countries) return success; otherwise,
  • For each color c of four colors,
    • If country n is not adjacent to a country that has been colored c
      • Color country n with color c
      • Recursively color country n+1
      • If successful, return success
  • Return failure (if loop exits)

Divide and Conquer

A divide and conquer algorithm consists of two parts:

  • Divide the problem into smaller subproblems of the same type, and solve these subproblems recursively
  • Combine the solutions to the subproblems into a solution to the original problem

Traditionally, an algorithm is only called divide and conquer if it contains two or more recursive calls

Following are some standard algorithms that are Divide and Conquer algorithms.

1) Binary Search is a searching algorithm. In each step, the algorithm compares the input element x with the value of the middle element in array. If the values match, return the index of middle. Otherwise, if x is less than the middle element, then the algorithm recurs for left side of middle element, else recurs for right side of middle element.

2) Quicksort is a sorting algorithm. The algorithm picks a pivot element, rearranges the array elements in such a way that all elements smaller than the picked pivot element move to left side of pivot, and all greater elements move to right side. Finally, the algorithm recursively sorts the subarrays on left and right of pivot element.

3) Merge Sort is also a sorting algorithm. The algorithm divides the array in two halves, recursively sorts them and finally merges the two sorted halves.

4) Closest Pair of Points The problem is to find the closest pair of points in a set of points in x-y plane. The problem can be solved in O(n^2) time by calculating distances of every pair of points and comparing the distances to find the minimum. The Divide and Conquer algorithm solves the problem in O(nLogn) time.

5) Strassen’s Algorithm is an efficient algorithm to multiply two matrices. A simple method to multiply two matrices need 3 nested loops and is O(n^3). Strassen’s algorithm multiplies two matrices in O(n^2.8974) time.

6) Cooley–Tukey Fast Fourier Transform (FFT) algorithm is the most common algorithm for FFT. It is a divide and conquer algorithm which works in O(nlogn) time.

7) Karatsuba algorithm for fast multiplication  it does multiplication of two n-digit numbers in at most 3 n^{\log_23}\approx 3 n^{1.585}single-digit multiplications in general (and exactly n^{\log_23} when n is a power of 2). It is therefore faster than the classical algorithm, which requires n2 single-digit products. If n = 210 = 1024, in particular, the exact counts are 310 = 59,049 and (210)2 = 1,048,576, respectively.

Some Examples:

Quick Sort:

  • Partition the array into two parts, and quicksort each of the parts
  • No additional work is required to combine the two sorted parts

Merge Sort:

  • Cut the array in half, and mergesort each half
  • Combine the two sorted arrays into a single sorted array by merging them

Dynamic programming algorithms

A dynamic programming algorithm remembers past results and uses them to find new results. Dynamic programming is generally used for optimization problems. This method  is for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions. A dynamic programming algorithm will examine the previously solved subproblems and will combine their solutions to give the best solution for the given problem.  This differs from Divide and Conquer, where subproblems generally need not overlap.

Example:

  • Multiple solutions exist, need to find the “best” one
    • Requires “optimal substructure” and “overlapping subproblems”
      • Optimal substructure: Optimal solution contains optimal solutions to subproblems
      • Overlapping subproblems: Solutions to subproblems can be stored and reused in a bottom-up fashion

Fibonacci numbers again

To find the nth Fibonacci number:

  • If n is zero or one, return one; otherwise,
    • Compute, or look up in a table, fibonacci(n-1) and fibonacci(n-2)
    • Find the sum of these two numbers
    • Store the result in a table and return it

Since finding the nth Fibonacci number involves finding all smaller Fibonacci numbers, the second recursive call has little work to do. The table may be preserved and used again later.


Greedy algorithms

A greedy algorithm is an algorithmic paradigm that follows the problem solving heuristic of making the locally optimal choice at each stage with the hope of finding a global optimum. In many problems, a greedy strategy does not in general produce an optimal solution, but nonetheless a greedy heuristic may yield locally optimal solutions that approximate a global optimal solution in a reasonable time.  An optimization problem is one in which you want to find, not just a solution, but the best solution. A “greedy algorithm” sometimes works well for optimization problems

A greedy algorithm works in phases: At each phase:

  • You take the best you can get right now, without regard for future consequences
  • You hope that by choosing a local optimum at each step, you will end up at a global optimum

Example:

Suppose you want to count out a certain amount of money, using the fewest possible bills and coins. So a greedy algorithm would do this would be

At each step, take the largest possible bill or coin that does not overshoot

  • To make $6.39, you can choose:
    • a $5 bill
    • a $1 bill, to make $6
    • a 25¢ coin, to make $6.25
    • A 10¢ coin, to make $6.35
    • four 1¢ coins, to make $6.39

For US money, the greedy algorithm always gives the optimum solution


Branch and bound algorithms

Branch and bound (BB, B&B, or BnB) is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root.

Branch and bound algorithms are generally used for optimization problems

  • As the algorithm progresses, a tree of subproblems is formed
  • The original problem is considered the “root problem”
  • A method is used to construct an upper and lower bound for a given problem
  • At each node, apply the bounding methods
    • If the bounds match, it is deemed a feasible solution to that particular subproblem
    • If bounds do not match, partition the problem represented by that node, and make the two subproblems into children nodes
  • Continue, using the best known feasible solution to trim sections of the tree, until all nodes have been solved or trimmed

Examples:

Travelling salesman problem: A salesman has to visit each of n cities (at least) once each, and wants to minimize total distance travelled

  • Consider the root problem to be the problem of finding the shortest route through a set of cities visiting each city once
  • Split the node into two child problems:
    • Shortest route visiting city A first
    • Shortest route not visiting city A first
  • Continue subdividing similarly as the tree grows

The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm.


Brute force algorithm

In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique that consists of systematically enumerating all possible candidates for the solution and checking whether each candidate satisfies the problem’s statement. A brute force algorithm simply tries all possibilities until a satisfactory solution is found.

A brute-force algorithm to find the divisors of a natural number n would enumerate all integers from 1 to n, and check whether each of them divides n without remainder. A brute-force approach for the eight queens puzzle would examine all possible arrangements of 8 pieces on the 64-square chessboard, and, for each arrangement, check whether each (queen) piece can attack any other.

  • Such an algorithm can be
    • Optimizing: Find the best solution. This may require finding all solutions, or if a value for the best solution is known, it may stop when any best solution is found
      • Example: Finding the best path for a travelling salesman
    • Satisficing: Stop as soon as a solution is found that is good enough
      • Example: Finding a travelling salesman path that is within 10% of optimal

Randomized algorithms

A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the “average case” over all possible choices of random bits. Formally, the algorithm’s performance will be a random variable determined by the random bits; thus either the running time, or the output (or both) are random variables.

  • A randomized algorithm uses a random number at least once during the computation to make a decision
    • Example: In Quicksort, using a random number to choose a pivot
    • Example: Trying to factor a large prime by choosing random numbers as possible divisors

That’s all for this post. I hope this will helps you.  If this post helped you to understand algorithms like a pro, considering sharing so others can benefit too.

Cheers!

Advertisements

5 Minute Guide to Clustering – Java Web Apps in Tomcat

What is Cluster ?

In J2EE Enterprises are choosing Java 2, Enterprise Edition (J2EE) to deliver their mission-critical applications over the Web. Within the J2EE framework, clusters provide mission-critical services to ensure minimal downtime and maximum scalability. A cluster is a group of application servers that transparently run your J2EE application as if it were a single entity. To scale, you should include additional machines within the cluster. To minimize downtime, make sure every component of the cluster is redundant.

In this article we will gain a foundational understanding of setting up clustering in Apache & Tomcat. yes you heard me right. For the purposes of the rest of this article, when I say “Apache” I mean the web server (like xampp), and when I say “Tomcat” I mean Tomcat.

 

nodes

There are pretty much two ways to set up basic clustering, which use two different Apache modules. The architecture for both, is the same. Apache sits in front of the Tomcat nodes and acts as a load balancer.

Traffic is passed between Apache and Tomcat(s) using the binary AJP 1.3 protocol. The two modules are modjk and modproxy.

  • mod_jk stands for “jakarta” the original project under which Tomcat was developed. It is the older way of setting this up, but still has some advantages.
  • modproxy is a newer and more generic way of setting this up. The rest of this guide will focus on modproxy, since it ships “out of the box” with newer versions of Apache.

You should be able to follow this guide by downloading Apache and Tomcat default distributions and following the steps.

Clustering Background:

You can cluster at the request or session level. Request level means that each request may go to a different node – this is the ideal since the traffic would be balanced across all nodes, and if a node goes down, the user has no idea. Unfortunately this requires session replication between all nodes, not just of HttpSession, but ANY session state. For the purposes of this article I’m going to describe Session level clustering, since it is simpler to set up, and works regardless of the dynamics of your application. ……. After all we only have 5 minutes!

Session level clustering means if your application is one that requires a login or other forms of session-state, and one or more your Tomcat nodes goes down, on their next request, the user will be asked to log in again, since they will hit a different node which does not have any stored session data for the user.

This is still an improvement on a non-clustered environment where, if your node goes down, you have no application at all! And we still get the benefits of load balancing across nodes, which allows us to scale our application out horizontally across many machines. Anyhow without further ado, let’s get into the how-to.

Setting Up The Nodes:

In most situations you would be deploying the nodes on physically separate machines, but in this example we will set them up on a single machine, but on different ports. This allows us to easily test this configuration. Nothing much changes for the physically separate set up – just the Hostnames of the nodes as you would expect. Oh and I’m working on Windows – but aside from the installation of Apache and Tomcat nothing is different between platforms since the configuration files are standard on all platforms.

  1. DownloadTomcat .ZIP distribution, e.g.
  2. We’ll use a folder to install all this stuff in. Let’s say it’s “C:\cluster” for the purposes of the article.
  3. Unzip the Tomcat distro twice, into two folders –C:\cluster\tomcat-node-1C:\cluster\tomcat-node-2
  4. Start up each of the nodes, using the bin/startup.bat / bin/startup.sh scripts. Ensure they start. If they don’t you may need to point Tomcat to the JDK installation on your machine.
  5. Open up the server.xml configuration onc:\cluster\tomcat-node-1\conf\server.xml

There are two places we need to (potentially) configure –

server_xml

The first line is the connector for the AJP protocol. The “port” attribute is the important part here. We will leave this one as is, but for our second (or subsequent) Tomcat nodes, we will need to change it to a different value. The second part is the “engine” element. The “jvmRoute” attribute has to be added – this configures the name of this node in the cluster. The “jvmRoute” must be unique across all your nodes. For our purposes we will use “node1” and “node2” for our two node cluster.

7. This step is optional, but for production configs, you may want to remove the HTTP connector for Tomcat – that’s one less port to secure, and you don’t need it for the cluster to operate. Comment out the following lines of the server.xml –

server_xml_7

8.Now repeat this forC:\cluster\tomcat-node-2\conf\server.xml Change the jvmRoute to “node2” and the AJP connector port to “8019”.

We’re done with Tomcat. Start each node up, and ensure it still works.

 

Setting Up The Apache Cluster

Now this an important part

  1. Download and installApache HTTP Server.Use the custom option to install it into C:\cluster\apache2.2
  2. Now open up c:\cluster\apache2.2\conf\httpd.conf in your favourite text editor.
  3.  Firstly, we need to uncomment the following lines (delete the ‘#’) –

httpd_conf

These enable the necessarymod_proxy modules in Apache.

4. Finally, go to the end of the file, and add the following:

<Proxy balancer://testcluster stickysession=JSESSIONID>
BalancerMember ajp://127.0.0.1:8009 min=10 max=100 route=node1 loadfactor=1
BalancerMember ajp://127.0.0.1:8019 min=20 max=200 route=node2 loadfactor=1
</Proxy>

ProxyPass /examples balancer://testcluster/examples

The above is the actual clustering configuration. The first section configures a load balancer across our two nodes. The loadfactor can be modified to send more traffic to one or the other node. i.e. how much load can this member handle compared to the others? This allows you to balance effectively if you have multiple servers which have different hardware profiles. Note also the “route” setting which must match the names of the “jvmRoutes” in the Tomcat server.xml for each node. This in conjunction with the “stickysession” setting is key for a Tomcat cluster, as this configures the session management. It tells mod_proxy to look for the node’s route in the given session cookie to determine which node that session is using. This allows all requests from a given client to go to the node which is holding the session state for the client. The ProxyPass line configures the actual URL from Apache to the load balanced cluster. You may want this to be “/” e.g. “ProxyPass /balancer://testcluster/” In our case we’re just configuring the Tomcat /examples application for our test.

10.Save it, and restart your Apache server.

Test Run

With your Apache server running you should be able to go toh ttp://localhost/examples

You should get a 503 error page- This is because both Tomcat nodes are down.

Start up node1 (c:\cluster\tomcat-node-1\bin\startup) and reload http://localhost/examples

You should see the examples application from the default Tomcat installation

page

Shut down node1, and then start up node2. Repeat the test. You should see the same page as above. We have transparently moved from node1 to node2 since node1 went down.Start both nodes up and your cluster is now working.

You’re done!

Thanks For Reading Folks.

Let me know its useful or not in comments.