The Artima Developer Community

Java Answers Forum
Help on reading html files

4 replies on 1 page. Most recent reply: Sep 16, 2003 11:09 PM by mausam

andy

Posts: 1
Nickname: andyl
Registered: Sep, 2003

Help on reading html files Posted: Sep 15, 2003 4:56 PM
Hi, I am having trouble with Java I/O. The project is to read a directory of HTML files and then copy all of the files into another directory.

What is generally the best way to read a whole HTML file so that you catch everything up to the end of the file? Does BufferedReader's readLine() work? I've tried reading the HTML files line by line and then using PrintWriter to print each line to a file in another directory. This should have produced an identical copy, but it doesn't. Sometimes the output even stops in the middle of the file. Is there any reason why?


Senthoorkumaran Punniamoorthy

Posts: 335
Nickname: senthoor
Registered: Mar, 2002

Re: Help on reading html files Posted: Sep 15, 2003 7:23 PM
Can you post your existing code and the html file for which it is failing?

zenykx

Posts: 69
Nickname: zenykx
Registered: May, 2003

Re: Help on reading html files Posted: Sep 16, 2003 3:09 PM
You should use a BufferedReader, but not necessarily the readLine() method; use the read(char[]) method instead. Note that this method returns the number of chars actually read, so you must loop with a while statement until read returns -1 (the end of the file).
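
The loop described above could look roughly like the following. This is only a sketch: the class name CharBufferCopy and the methods copy and copyFile are made up for illustration, and it assumes Java 7 or later (try-with-resources, java.nio.file). Closing the writer is what flushes any buffered output, so a writer that is never closed is one plausible cause of a truncated copy. Also note that readLine() drops the line terminators, so a readLine()/println() copy can differ from the source even when nothing is lost.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class; none of these names come from the thread.
class CharBufferCopy {

    // Copies every char from 'in' to 'out' and returns how many chars were copied.
    static long copy(Reader in, Writer out) throws IOException {
        char[] buffer = new char[4096];
        long total = 0;
        int charsRead;
        // read(char[]) returns the number of chars actually read, or -1 at end of stream.
        while ((charsRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, charsRead); // write only the chars that were read
            total += charsRead;
        }
        return total;
    }

    // Copies one text file; the charset is an assumption the caller must supply.
    static long copyFile(Path from, Path to, Charset charset) throws IOException {
        // try-with-resources closes (and therefore flushes) both streams,
        // even if an exception is thrown part-way through the copy.
        try (Reader in = Files.newBufferedReader(from, charset);
             Writer out = Files.newBufferedWriter(to, charset)) {
            return copy(in, out);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java CharBufferCopy <source> <destination>");
            return;
        }
        long n = copyFile(Paths.get(args[0]), Paths.get(args[1]), StandardCharsets.UTF_8);
        System.out.println("Copied " + n + " chars");
    }
}
```

Reading chars means the bytes are decoded and re-encoded, so the copy is only byte-identical when the chosen charset round-trips the file. If an exact copy is all that is needed, copying raw bytes avoids the question entirely.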

mausam

Posts: 243
Nickname: mausam
Registered: Sep, 2003

Re: Help on reading html files Posted: Sep 16, 2003 11:08 PM
/*
 * @(#)HTMLFileCopier.java	1.0 03/18/2002
 *
 * Copyright (c) 2001-2002 by Aminur Rashid. All Rights Reserved.
 *
 * Aminur grants you ("Licensee") a non-exclusive, royalty free, license to use,
 * modify and redistribute this software in source and binary code form,
 * provided that i) this copyright notice and license appear on all copies of
 * the software; and ii) Licensee does not utilize the software in a manner
 * which is disparaging to Aminur.
 *
 * This software is provided "AS IS," without a warranty of any kind. ALL
 * EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY
 * IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR
 * NON-INFRINGEMENT, ARE HEREBY EXCLUDED. IN NO EVENT WILL AMINUR OR 
 * BE LIABLE FOR ANY LOST REVENUE, PROFIT OR DATA, OR FOR DIRECT,
 * INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER
 * CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF THE USE OF
 * OR INABILITY TO USE SOFTWARE, EVEN IF AMINUR HAS BEEN ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGES.
 *
 * This software is not designed or intended for use in on-line control of
 * aircraft, air traffic, aircraft navigation or aircraft communications; or in
 * the design, construction, operation or maintenance of any nuclear
 * facility. Licensee represents and warrants that it will not use or
 * redistribute the Software for such purposes.
 */
 
import java.io.*;
public class HTMLFileCopier
{
	public static void main(String args[])
	{
		File directory = new  File("D:\\amin\\birthday");
		if (directory.isDirectory())
		{
			FilenameFilter filter = new HTMLFilter();
			File [] allHTMLFiles = directory.listFiles(filter);
			for (int i=0;i<allHTMLFiles.length ;i++ )
			{
				File destinationDirectory = new File("D:\\amin\\temp");
				if (!destinationDirectory.exists())
					destinationDirectory.mkdir();
				File file = allHTMLFiles[i];
				// Copy file to new directory
				try
				{
					HTMLFileCopier.copy(file.getAbsolutePath(),destinationDirectory.getAbsolutePath()+"\\"+file.getName());
				} catch (Exception e)
				{
					e.printStackTrace();
				}
			}
		}
	}
	
	/**
    *  The static method that actually performs the file copy. Before copying
    *  the file, it checks the source and destination and logs any problem it
    *  finds. Note that a failed check is only logged; the copy is still attempted.
    */
   public static void copy( String from_name, String to_name ) throws IOException
   {
      // Get File objects from Strings
      File from_file = new File( from_name );
      File to_file = new File( to_name );
	  
 
      // First make sure the source file exists, is a file, and is readable.
      if( !from_file.exists() )
         log( "FileCopy: no such source file: " + from_name );
 
      if( !from_file.isFile() )
         log( "FileCopy: can't copy directory: " + from_name );
 
      if( !from_file.canRead() )
         log( "FileCopy: source file is unreadable: " + from_name );
 
      // If the destination is a directory, use the source file name
      // as the destination file name
      if( to_file.isDirectory() )
         to_file = new File( to_file, from_file.getName() );
 
      // If the destination exists, make sure it is a writeable file
      // (an existing file is overwritten without asking).  If the destination
      // doesn't exist, make sure the directory exists and is writeable.
      if( to_file.exists() )
      {
         if( !to_file.canWrite() )
            log( "FileCopy: destination file is unwriteable: " + to_name );
 
      }
      else
      {
         // if file doesn't exist, check if directory exists and is writeable.
         // If getParent() returns null, then the directory is the current dir.
         // so look up the user.dir system property to find out what that is.
         String parent = to_file.getParent();   // Get the destination directory
         if( parent == null )
            parent = System.getProperty( "user.dir" );   // or CWD
 
         File dir = new File( parent );   // Convert it to a file.
         if( !dir.exists() )
            log( "FileCopy: destination directory doesn't exist: " + parent );
 
         if( dir.isFile() )
            log( "FileCopy: destination is not a directory: " + parent );
 
         if( !dir.canWrite() )
            log( "FileCopy: destination directory is unwriteable: " + parent );
 
      }
 
      // If we've gotten this far, then everything is okay.
      // So we copy the file, a buffer of bytes at a time.
      FileInputStream from = null;   // Stream to read from source
      FileOutputStream to = null;    // Stream to write to destination
      try
      {
         from = new FileInputStream( from_file );   // Create input stream
         to = new FileOutputStream( to_file );      // Create output stream
         byte[] buffer = new byte[4096];            // A buffer to hold file contents
         int bytes_read;                            // How many bytes in buffer
         // Read a chunk of bytes into the buffer, then write them out,
         // looping until we reach the end of the file (when read() returns -1).
         // Note the combination of assignment and comparison in this while
         // loop.  This is a common I/O programming idiom.
         while( ( bytes_read = from.read( buffer ) ) != -1 )   // Read bytes until EOF
         {
            to.write( buffer, 0, bytes_read );                 // write bytes
         }
      }
      // Always close the streams, even if exceptions were thrown
      finally
      {
         if( from != null )
         {
            try
            {
               from.close();
            }
            catch( IOException e )
            {
               ;
            }
         }
         if( to != null )
         {
            try
            {
               to.close();
            }
            catch( IOException e )
            {
               ;
            }
         }
      }
   }
 
	private static void log(String s)
	{
		System.out.println(s);
	}
}
 
class HTMLFilter implements FilenameFilter 
{ 
			public boolean accept(File f, String name)
			{
				name = name.toLowerCase (); 
				return(name.endsWith (".html") || name.endsWith("htm")) ; 
			}
} 
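
For readers on Java 7 or later, the same directory copy can probably be written more compactly with java.nio.file. The following is a sketch of that alternative, not a change to the code above; the class name NioHtmlCopier and the method copyHtmlFiles are hypothetical, and the directories are passed as arguments rather than hard-coded.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

// Hypothetical alternative to HTMLFileCopier, assuming Java 7+.
class NioHtmlCopier {

    // Copies every regular *.html / *.htm file from sourceDir into targetDir
    // and returns the number of files copied.
    static int copyHtmlFiles(Path sourceDir, Path targetDir) throws IOException {
        Files.createDirectories(targetDir); // no-op if the directory already exists
        int copied = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(sourceDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString().toLowerCase(Locale.ROOT);
                // Both suffixes include the leading dot.
                boolean isHtml = name.endsWith(".html") || name.endsWith(".htm");
                if (isHtml && Files.isRegularFile(entry)) {
                    // Files.copy transfers the raw bytes, so the copy is identical;
                    // REPLACE_EXISTING overwrites an existing destination file.
                    Files.copy(entry, targetDir.resolve(entry.getFileName()),
                               StandardCopyOption.REPLACE_EXISTING);
                    copied++;
                }
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java NioHtmlCopier <sourceDir> <targetDir>");
            return;
        }
        int n = copyHtmlFiles(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Copied " + n + " file(s)");
    }
}
```

Unlike the copy() method above, this version lets any failure propagate as an IOException instead of logging it and carrying on.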
 

mausam

Posts: 243
Nickname: mausam
Registered: Sep, 2003

Re: Help on reading html files Posted: Sep 16, 2003 11:09 PM
Correction: I missed the "." in name.endsWith("htm"). It should be name.endsWith(".htm").
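
To illustrate the difference, here is a small sketch of the filter with the dot in place. The class name FixedHTMLFilter and the sample file names are made up for illustration; without the dot, names such as "notes.xhtm" or "backuphtm" would also be accepted.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Locale;

// Hypothetical corrected version of HTMLFilter.
class FixedHTMLFilter implements FilenameFilter {

    @Override
    public boolean accept(File dir, String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        // Both suffixes include the leading dot, so only real extensions match.
        return lower.endsWith(".html") || lower.endsWith(".htm");
    }

    public static void main(String[] args) {
        FilenameFilter filter = new FixedHTMLFilter();
        File dir = new File("."); // the directory argument is not used by this filter
        String[] samples = { "index.html", "ABOUT.HTM", "notes.xhtm", "backuphtm" };
        for (String sample : samples) {
            System.out.println(sample + " -> " + filter.accept(dir, sample));
        }
    }
}
```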


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use