Text File Processing
We provide 5 functions for basic text file reading and writing: readLine, readLines, readLines!, writeLine and writeLines. When the system reads a line from a text file, a carriage return, a newline, or a carriage return followed by a newline is treated as the line delimiter. When the system writes a line to a text file, it appends a line delimiter to the line. The line delimiter depends on the operating system: on Windows it is a carriage return followed by a newline; on other systems it is the newline character.
Read and Write a Single Line
The writeLine function writes a single line to the given file. The function automatically appends a line delimiter to the string, so the string should not end with a line delimiter itself. If the operation succeeds, the function returns 1; otherwise, an IOException is raised. The readLine function reads a line from the given file. The returned line does not include the line delimiter. When the end of the file is reached, the function returns a NULL object, which can be tested with the isVoid function. If the operation fails for any other reason, an IOException is raised.
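// write five symbols, one per line
x=`IBM`MSFT`GOOG`YHOO`ORCL
eachRight(writeLine, file("test.txt","w"), x)
// read the lines back until readLine returns a NULL object at end of file
fin = file("test.txt")
do{
    x=fin.readLine()
    if(x.isVoid()) break
    print x
}while(true);
fin.close()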
// output
IBM
MSFT
GOOG
YHOO
ORCL
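As a small check of the delimiter handling described above, the following sketch writes one line and reads it back. The file name "delim_demo.txt" is a hypothetical name chosen for illustration, and the sketch assumes the session can write to its working directory. Since writeLine appends the delimiter and readLine strips it, the string read back should have the same length as the string written.

```dolphindb
// hypothetical file name, used only for this illustration
fout = file("delim_demo.txt", "w")
fout.writeLine("IBM")          // the delimiter is appended automatically
fout.close()

fin = file("delim_demo.txt")
s = fin.readLine()             // the returned line excludes the delimiter
fin.close()
print(strlen(s))               // should match the length of "IBM"
```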
Read and Write Multiple Lines
The writeLines function writes multiple lines to the given file. The function automatically appends a line delimiter to each line. If the operation succeeds, the function returns the number of lines written; otherwise, an IOException is raised. The readLines function reads a specified number of lines from the file; the default is 1024. The function returns when the end of the file is reached or when the given number of lines has been read. If fewer lines are returned than were requested, the end of the file has been reached. If the operation fails for any other reason, an IOException is raised.
timer(10){
x=rand(`IBM`MSFT`GOOG`YHOO`ORCL,10240)
eachRight(writeLine, file("test.txt","w"),x)
fin = file("test.txt")
do{ y=fin.readLine() }while(!y.isVoid())
fin.close()
};
// output
Time elapsed: 271.035 ms
timer(10){
x=rand(`IBM`MSFT`GOOG`YHOO`ORCL,10240)
file("test.txt","w").writeLines(x)
fin = file("test.txt")
do{ y=fin.readLines(1024)}while(y.size()==1024)
fin.close()
};
// output
Time elapsed: 33.503 ms
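The end-of-file rule for readLines can be sketched as a chunked read that counts lines. This is a minimal illustration, not a reference implementation: the file name "lines_demo.txt" and the line count 2500 are assumptions made for the example. The loop continues while each call returns a full batch of 1024 lines and stops at the first short batch.

```dolphindb
// hypothetical file name and line count, chosen for illustration
x = rand(`IBM`MSFT`GOOG`YHOO`ORCL, 2500)
fout = file("lines_demo.txt", "w")
fout.writeLines(x)
fout.close()

fin = file("lines_demo.txt")
total = 0
do{
    y = fin.readLines(1024)        // returns at most 1024 lines
    total = total + y.size()
}while(y.size() == 1024)           // a short batch signals end of file
fin.close()
print(total)
```

With 2500 lines, the calls should return batches of 1024, 1024 and 452 lines. When the line count is an exact multiple of the batch size, as with the 10240 lines used in the timing examples, the loop needs one extra call that returns an empty batch before it stops.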
The examples above compare the efficiency of single-line and multi-line processing, each run 10 times on 10240 lines. The multi-line version is about 8 times faster (33.503 ms versus 271.035 ms). Each call to readLines creates a new string vector to return. Creating the vector takes time, so reusing one vector as a buffer across repeated calls saves time. readLines! does exactly that: it accepts an existing vector as the data holder. The 2 examples below each read the same data 100 times; the readLines! version is faster.
timer(100){
fin = file("test.txt")
do{ y=fin.readLines(1024) } while(y.size()==1024)
fin.close()
};
// output
Time elapsed: 79.511 ms
timer(100){
fin = file("test.txt")
y=array(STRING,1024)
do{ lines = fin.readLines!(y,0,1024) } while(lines==1024)
fin.close()
};
// output
Time elapsed: 56.034 ms
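Outside of a timing loop, the buffer-reuse pattern might look like the sketch below, which counts the lines in a file with readLines!. The file name "buffer_demo.txt" and the line count 2500 are assumptions made for the example. The call follows the form used above, readLines!(buffer, 0, 1024), and relies on its return value being the number of lines read in that call.

```dolphindb
// hypothetical file name and line count, chosen for illustration
x = rand(`IBM`MSFT`GOOG`YHOO`ORCL, 2500)
fout = file("buffer_demo.txt", "w")
fout.writeLines(x)
fout.close()

fin = file("buffer_demo.txt")
buf = array(STRING, 1024)              // allocated once, reused on every call
total = 0
do{
    n = fin.readLines!(buf, 0, 1024)   // n is the number of lines read into buf
    total = total + n
}while(n == 1024)                      // a short read signals end of file
fin.close()
print(total)
```

Because the buffer is reused, it seems safest to treat only the first n elements of buf as valid after each call; elements beyond that may still hold lines from an earlier batch.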