Class Webcrawler.Crawler.LoadableNode
java.lang.Object
|
+----Webcrawler.Crawler.URLNode
|
+----Webcrawler.Crawler.LoadableNode
- public class LoadableNode
- extends URLNode
This class is derived from URLNode and holds additional information about a link
that can be loaded (such as an FTP or HTML resource). A URL that points to something loadable
can be dead, malformed, or recursive (that is, already downloaded before). The class
java.net.URLConnection has a few methods for finding out more about a link,
such as its content length, date, and content type. Since a LoadableNode can be downloaded
from the net onto the local hard drive, each node stores the name of the file
where it is stored locally.
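
The link states described above can be illustrated with a minimal sketch. It is not the Webcrawler source: the class name UrlTypeSketch, the method classify, the visited-set approach, and the numeric constant values are all assumptions made for illustration (this page names the constants but does not list their values). The dead and interrupted states need network access and are left out.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for URLType bookkeeping; not the Webcrawler source. */
class UrlTypeSketch {
    // Assumed values: the documentation names these constants but not their values.
    static final int NORMAL = 0;
    static final int MALFORMED = 2;
    static final int RECURSIVE = 3;

    /** Classifies a link without touching the network. */
    @SuppressWarnings("deprecation") // new URL(String) is deprecated in recent JDKs but still available
    static int classify(String spec, Set<String> alreadyLoaded) {
        final URL url;
        try {
            url = new URL(spec); // throws if the protocol is missing or unknown
        } catch (MalformedURLException e) {
            return MALFORMED;
        }
        // Set.add returns false when this URL was recorded by an earlier call.
        if (!alreadyLoaded.add(url.toExternalForm())) {
            return RECURSIVE;
        }
        return NORMAL;
    }

    public static void main(String[] args) {
        Set<String> seen = new HashSet<>();
        System.out.println(classify("no scheme here", seen));
        System.out.println(classify("http://example.com/a.html", seen));
        System.out.println(classify("http://example.com/a.html", seen));
    }
}
```

The second and third calls pass the same URL, so the third one is reported as recursive because the first successful classification already recorded it.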
-
contentLength
- file-length in bytes
-
contentType
- e.g: "text/html" or "image/gif"
-
date
- file-creation date
-
dead
-
-
expiration
- expiration date
-
interrupted
-
-
lastModified
- date when file was last modified
-
localFile
- fileName of locally stored file
-
malformed
-
-
normal
-
-
recursive
-
-
URLType
- see static finals above (default: normal)
-
LoadableNode()
-
-
LoadableNode(String)
-
-
LoadableNode(URL, String)
-
-
canBeLoaded(URLConnection)
- Says whether the URL of this node can be loaded/exists or not.
-
copy(LoadableNode)
- Copies all the URLConnection info and the localFile field.
-
getContentLength()
-
-
getContentType()
-
-
getDate()
-
-
getExpiration()
-
-
getLastModified()
-
-
getLocalFile()
-
-
getURLConnectionInfo()
- If you don't have a URLConnection open (yet), use this method.
-
getURLConnectionInfo(URLConnection)
- Sets the contentLength through lastModified fields of this node to
whatever can be retrieved from the URLConnection uc.
-
getURLType()
-
-
setLocalFile(String)
- A Controller often needs to reset the localFile attribute of a node.
-
setLocalFileInvalid()
- Sets the localFile field back to an empty String.
-
toString()
- In case this node has a malformed URL, the first word of the infoText is returned.
normal
public static final int normal
dead
public static final int dead
malformed
public static final int malformed
recursive
public static final int recursive
interrupted
public static final int interrupted
URLType
protected int URLType
- see static finals above (default: normal)
contentLength
protected int contentLength
- file-length in bytes
contentType
protected String contentType
- e.g: "text/html" or "image/gif"
date
protected long date
- file-creation date
expiration
protected long expiration
- expiration date
lastModified
protected long lastModified
- date when file was last modified
localFile
protected String localFile
- fileName of locally stored file
LoadableNode
public LoadableNode()
- See Also:
- URLNode
LoadableNode
public LoadableNode(String url) throws MalformedURLException
- See Also:
- URLNode
LoadableNode
public LoadableNode(URL context,
String spec) throws MalformedURLException
- See Also:
- URLNode
copy
public void copy(LoadableNode from)
- Copies all the URLConnection info and the localFile field.
getURLType
public int getURLType()
- Returns:
- the URLType of this node (e.g. dead)
getContentLength
public int getContentLength()
- Returns:
- the size of the URL's content in bytes
- See Also:
- URLConnection
getContentType
public String getContentType()
- Returns:
- the content type
- See Also:
- URLConnection
getDate
public long getDate()
- Returns:
- the date when the file was created
- See Also:
- URLConnection
getExpiration
public long getExpiration()
- Returns:
- the date when the file expires
- See Also:
- URLConnection
getLastModified
public long getLastModified()
- Returns:
- the date when the file was last modified
- See Also:
- URLConnection
getLocalFile
public String getLocalFile()
- Returns:
- the filename of the locally stored file
setLocalFileInvalid
public void setLocalFileInvalid()
- Sets the localFile field back to an empty String. A Controller
can use this method, for example, after the local files have been
deleted.
setLocalFile
public void setLocalFile(String lf)
- A Controller often needs to reset the localFile attribute of a node.
toString
public String toString()
- If this node has a malformed URL, the first word of the infoText is returned.
Otherwise URLNode.toString() is called.
- Returns:
- a String-representation of the URL of this node
- Overrides:
- toString in class URLNode
canBeLoaded
public boolean canBeLoaded(URLConnection uc)
- Says whether the URL of this node can be loaded (that is, whether it exists).
For an HTTP URL, an HttpURLConnection is opened and the response code is checked.
The response code must be < 300 to be OK.
For a FILE or FTP URL, a stream is opened and any IOException is caught.
- Parameters:
- uc - an existing URLConnection to reuse
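
The check described for canBeLoaded can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual method body: the class name LoadCheckSketch is hypothetical, the method is static here for brevity, and rejecting a response code of -1 (which java.net.HttpURLConnection returns when no valid status line was received) is a choice made for this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a loadability check; not the Webcrawler source. */
class LoadCheckSketch {
    static boolean canBeLoaded(URLConnection uc) {
        try {
            if (uc instanceof HttpURLConnection) {
                int code = ((HttpURLConnection) uc).getResponseCode();
                // Assumption of this sketch: -1 (no valid HTTP status) counts as not loadable.
                return code != -1 && code < 300;
            }
            // Non-HTTP (e.g. file or ftp): opening the stream is the test.
            try (InputStream in = uc.getInputStream()) {
                return true;
            }
        } catch (IOException e) {
            return false; // covers failures from getResponseCode and getInputStream
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("loadable", ".txt");
        URLConnection present = tmp.toUri().toURL().openConnection();
        System.out.println(canBeLoaded(present));

        Files.delete(tmp);
        URLConnection missing = tmp.toUri().toURL().openConnection();
        System.out.println(canBeLoaded(missing));
    }
}
```

The main method uses a temporary file URL so the sketch can be tried without network access.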
getURLConnectionInfo
public void getURLConnectionInfo(URLConnection uc)
- Sets the contentLength through lastModified fields of this node to
whatever can be retrieved from the URLConnection uc.
Use this if you already have a URLConnection open.
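
A minimal sketch of how the five fields might be filled from an open connection is shown below. The class name ConnectionInfoSketch is hypothetical, and the one-to-one mapping of each field to the java.net.URLConnection getter of the same name is an assumption based on the field names, not something this page confirms.

```java
import java.io.IOException;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of the metadata fields; not the Webcrawler source. */
class ConnectionInfoSketch {
    int contentLength;
    String contentType;
    long date;
    long expiration;
    long lastModified;

    /** Copies whatever the connection reports; unknown values stay at the JDK defaults. */
    void getURLConnectionInfo(URLConnection uc) {
        contentLength = uc.getContentLength(); // -1 when the length is unknown
        contentType = uc.getContentType();     // null when the type is unknown
        date = uc.getDate();                   // 0 when not known
        expiration = uc.getExpiration();       // 0 when not known
        lastModified = uc.getLastModified();   // 0 when not known
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("info", ".txt");
        Files.write(tmp, "hello".getBytes());

        URLConnection uc = tmp.toUri().toURL().openConnection();
        ConnectionInfoSketch node = new ConnectionInfoSketch();
        node.getURLConnectionInfo(uc);
        System.out.println(node.contentLength + " bytes, type " + node.contentType);

        uc.getInputStream().close(); // release the file before deleting it
        Files.delete(tmp);
    }
}
```

Note that java.net.URLConnection documents getDate() as the value of the date header field, so the "file-creation date" wording on this page may only hold approximately.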
getURLConnectionInfo
public URLConnection getURLConnectionInfo()
- If you don't have a URLConnection open (yet), use this method. It
opens a connection and calls getURLConnectionInfo(connection).
- Returns:
- the opened URLConnection, for reuse
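
The open-then-reuse pattern of the two overloads might look like the sketch below. The class name ReusableConnectionSketch and its fields are hypothetical. Unlike the documented signature, the no-argument method here declares IOException; how the real class handles a failed open is not stated on this page.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of the two overloads; not the Webcrawler source. */
class ReusableConnectionSketch {
    private final URL url;
    int contentLength;
    long lastModified;

    ReusableConnectionSketch(URL url) {
        this.url = url;
    }

    /** Variant for callers that already hold an open connection. */
    void getURLConnectionInfo(URLConnection uc) {
        contentLength = uc.getContentLength();
        lastModified = uc.getLastModified();
    }

    /** Variant that opens the connection itself and hands it back for reuse. */
    URLConnection getURLConnectionInfo() throws IOException {
        URLConnection uc = url.openConnection();
        getURLConnectionInfo(uc);
        return uc;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("node", ".txt");
        Files.write(tmp, new byte[] {1, 2, 3, 4, 5});

        ReusableConnectionSketch node = new ReusableConnectionSketch(tmp.toUri().toURL());
        URLConnection uc = node.getURLConnectionInfo();

        // Reuse the same connection to read the body instead of opening a second one.
        try (InputStream in = uc.getInputStream()) {
            System.out.println("first byte: " + in.read() + ", length: " + node.contentLength);
        }
        Files.delete(tmp);
    }
}
```

Returning the connection lets the caller go on to download the content without a second round trip, which appears to be the point of the "for reuse" remark above.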