
Class Webcrawler.Crawler.Crawler

java.lang.Object
   |
   +----java.util.Observable
           |
           +----Webcrawler.Crawler.Crawler

public class Crawler
extends Observable
implements Observer
The Crawler represents the model in the MVC concept. The model consists of a URLTree object, which organises all the nodes (the real model), a todo pool and a done pool, and the Readers, which download the files from the net. The Crawler communicates with one Controller through method calls, and with the attached Visualizers as an Observable. The Crawler itself uses the observer/observable mechanism to synchronize the todo and done pools and the readers. For a more detailed description of how the Crawler works, see the CrawlerDetails.HTML file.
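
The observer/observable wiring described above can be illustrated with a minimal sketch. It is not the Crawler's actual code: SketchWorker, SketchModel and RecordingVisualizer are hypothetical stand-ins, and the "done: " message text is an assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Observable;
import java.util.Observer;

// Hypothetical stand-in for a Readers/Parsers-style worker that reports finished nodes.
class SketchWorker extends Observable {
    void finish(String url) {
        setChanged();          // without this call, notifyObservers does nothing
        notifyObservers(url);
    }
}

// Hypothetical model: observes its workers and is itself observed by visualizers.
class SketchModel extends Observable implements Observer {
    @Override
    public synchronized void update(Observable source, Object message) {
        setChanged();
        notifyObservers("done: " + message);   // forward the event to attached visualizers
    }
}

// Hypothetical visualizer that simply records every message it receives.
class RecordingVisualizer implements Observer {
    final List<String> seen = new ArrayList<>();

    @Override
    public void update(Observable source, Object message) {
        seen.add(String.valueOf(message));
    }
}
```

A worker is attached with worker.addObserver(model) and a visualizer with model.addObserver(view); a call to worker.finish(url) then reaches the visualizer through the model. Note that java.util.Observable and java.util.Observer are deprecated since Java 9, although they are still available.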

See Also:
ControllerInterface, URLTree, FIFOQueue

Variable Index

 o controller
 o gotbackParsers
 o gotbackReaders
 o maxThreadNum
 o parsers
 o readers
 o sentoutParsers
 o sentoutReaders
 o todoParsers
 o todoReaders
 o tree

Constructor Index

 o Crawler(ControllerInterface)
Creates a new Crawler, which in turn creates the 2 pools and the readers.

Method Index

 o checkForCrawlerDone()
 o checkUnevenWorkload()
Checks if one of the 2 pools has more nodes waiting than the other.
 o getParsersCounter()
 o getParsingNodes()
 o getReadersCounter()
 o getReadingNodes()
 o getTodoParsersElements()
 o getTodoReadersElements()
 o nodeIsDone(URLNode)
 o notifyVisualizers(int, Object)
 o processParsersMessage(Observable, ParsersMessage)
 o processReadersMessage(Observable, ReadersMessage)
 o sendNodeToReaders(URLNode)
 o setMaxThreadNum(int)
Called by the Controller to set the maximum number of Reader/Parser threads.
 o start(String)
Called by the Controller to start the Crawler.
 o stop()
Called by the Controller to stop the Crawler.
 o update(Observable, Object)
Called by either the Parsers or the Readers when a node is done.

Variables

 o controller
 private ControllerInterface controller
 o tree
 private URLTree tree
 o todoReaders
 private FIFOQueue todoReaders
 o todoParsers
 private FIFOQueue todoParsers
 o readers
 private Readers readers
 o parsers
 private Parsers parsers
 o maxThreadNum
 private int maxThreadNum
 o sentoutReaders
 private int sentoutReaders
 o gotbackReaders
 private int gotbackReaders
 o sentoutParsers
 private int sentoutParsers
 o gotbackParsers
 private int gotbackParsers

Constructors

 o Crawler
 public Crawler(ControllerInterface controller)
Creates a new Crawler, which in turn creates the 2 pools and the readers. The Crawler is an observer of the done pool. The Readers object connects itself to the todo pool and the done pool as an observer. The Crawler needs a Controller to operate: the Controller tells the Crawler where and when to start, whether a link should be loaded, and other things.

See Also:
ControllerInterface

Methods

 o setMaxThreadNum
 public void setMaxThreadNum(int maxThreadNum)
Called by the Controller to set the maximum number of Reader/Parser threads.

 o start
 public void start(String startURL) throws MalformedURLException
Called by the Controller to start the Crawler.
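
The throws clause suggests that the start URL is parsed before crawling begins. Assuming that is the case, the behaviour might look roughly like the sketch below. StartSketch, its fields and isRunning() are hypothetical and are not part of the documented class.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical sketch of a start/stop pair; not the real Crawler implementation.
class StartSketch {
    private URL startUrl;       // assumed: the parsed form of the start URL
    private boolean running;    // assumed: simple running flag

    void start(String startURL) throws MalformedURLException {
        // new URL(...) throws MalformedURLException for strings it cannot parse,
        // so an invalid start URL is rejected before the running flag is set.
        startUrl = new URL(startURL);
        running = true;
    }

    void stop() {
        running = false;
    }

    boolean isRunning() {
        return running;
    }
}
```

In this sketch, a caller such as a Controller would catch MalformedURLException and report the bad start URL to the user rather than starting the crawl.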

 o stop
 public void stop()
Called by the Controller to stop the Crawler.

 o sendNodeToReaders
 private void sendNodeToReaders(URLNode un)
 o update
 public synchronized void update(Observable observable,
                                 Object message)
Called by either the Parsers or the Readers when a node is done.
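
Since update receives a plain Object and the class also declares processReadersMessage and processParsersMessage, one plausible reading is that update dispatches on the message type. The sketch below illustrates that idea only. The ReadersMessage and ParsersMessage classes here are empty stand-ins (their real fields are not documented on this page), and incrementing the gotback counters is an assumption about what the handlers do.

```java
import java.util.Observable;
import java.util.Observer;

// Stand-ins for the real message classes, whose contents are not documented here.
class ReadersMessage { }
class ParsersMessage { }

// Hypothetical dispatcher mirroring the documented method signatures.
class DispatchSketch implements Observer {
    int gotbackReaders = 0;   // assumed meaning: messages received back from the Readers
    int gotbackParsers = 0;   // assumed meaning: messages received back from the Parsers

    @Override
    public synchronized void update(Observable observable, Object message) {
        if (message instanceof ReadersMessage) {
            processReadersMessage(observable, (ReadersMessage) message);
        } else if (message instanceof ParsersMessage) {
            processParsersMessage(observable, (ParsersMessage) message);
        }
        // Any other message type is ignored in this sketch.
    }

    private void processReadersMessage(Observable readers, ReadersMessage m) {
        gotbackReaders++;
    }

    private void processParsersMessage(Observable parsers, ParsersMessage m) {
        gotbackParsers++;
    }
}
```

Declaring update as synchronized, as the documented signature does, keeps the counters consistent when several Reader and Parser threads report back at the same time.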

 o notifyVisualizers
 private synchronized void notifyVisualizers(int vmtype,
                                             Object node)
 o processReadersMessage
 private void processReadersMessage(Observable readers,
                                    ReadersMessage m)
 o processParsersMessage
 private void processParsersMessage(Observable parsers,
                                    ParsersMessage m)
 o nodeIsDone
 private void nodeIsDone(URLNode n)
 o checkForCrawlerDone
 private void checkForCrawlerDone()
 o checkUnevenWorkload
 private void checkUnevenWorkload()
Checks if one of the 2 pools has more nodes waiting than the other. Whichever of the Readers or Parsers has significantly more to do gets more threads than the other.
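
The page does not say what counts as "significantly more" or how many threads are moved. The following sketch shows one possible policy under stated assumptions: maxThreadNum is treated as a total budget of at least 2 threads shared by both sides, a backlog more than twice as large as the other counts as significant, and the busier side then receives roughly three quarters of the threads. WorkloadSketch and split are hypothetical names.

```java
// Hypothetical rebalancing policy; thresholds and ratios are assumptions, not documented values.
class WorkloadSketch {

    // Returns {readerThreads, parserThreads}; the two values always sum to maxThreadNum.
    static int[] split(int todoReaders, int todoParsers, int maxThreadNum) {
        int small = Math.max(1, maxThreadNum / 4);   // share left to the less busy side
        int large = maxThreadNum - small;            // share given to the busier side

        if (todoReaders > 2 * todoParsers) {
            return new int[] { large, small };       // the Readers have the bigger backlog
        }
        if (todoParsers > 2 * todoReaders) {
            return new int[] { small, large };       // the Parsers have the bigger backlog
        }
        int half = maxThreadNum / 2;                 // roughly even workload: split evenly
        return new int[] { half, maxThreadNum - half };
    }
}
```

For example, with 100 nodes waiting for the Readers, 10 waiting for the Parsers and a budget of 8 threads, this policy assigns 6 threads to the Readers and 2 to the Parsers.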

 o getTodoReadersElements
 public Enumeration getTodoReadersElements()
 o getTodoParsersElements
 public Enumeration getTodoParsersElements()
 o getReadingNodes
 public Vector getReadingNodes()
 o getReadersCounter
 public int getReadersCounter()
 o getParsingNodes
 public Vector getParsingNodes()
 o getParsersCounter
 public int getParsersCounter()
