QXLS.INC by Doctor Electron Copyright 2002, 2003 Global Services June 9, 2003 Rare problem with ReadOLEFile SUB fixed. May 30, 2003 Faster version by value type matching in source code. Apr 13, 2003 xlsOPEN fixed for files with "out of order" sectors. Apr 8, 2003 xlsREAD of certain negative numbers fixed. Mar 30, 2003 xlsOPEN of certain large files now faster. Mar 11, 2003 .xls backup to .csv files source code included. Jan 9, 2003 Faster inline code and xlsREAD of Russian characters. FREEWARE: You are free to use this program, but permission from the author is required for sale of this program or packaging it in products for sale or for use in commercial settings. 1.1 KNOWN PROBLEMS: None. MS Excel will corrupt some files it writes. Qxls deciphers the corruption and restores them to normal, much like Excel must do when it reads the corrupted file. Thus, Qxls has hacked the corruption scheme MS uses. One "known solution" is that xlsSAVE will store your data in uncorrupted form. Also, MS invents BIFF8 record codes from time to time. However, the sERR string will report unrecognized codes (see below). 1.2 LIMITATIONS: Please see xlsSAVE and xlsWRITE descriptions below. 1.3 This $INCLUDE file for RapidQ Basic provides direct access to MS Excel 97 (and up) using BIFF8 format without lauching the MS Excel program itself. It may be relatively easy to translate this code into MS VB or other Basic coding environments. Indeed, the RK decoding routine herein was translated from C. The rest is original code. 1.4 Three programs using qxls.inc are included. (1) qxls.bas prints values from .xls files illustrating how the functions may be used. (2) rowtally.bas tests all .xls files in a directory for readability (xlsOPEN successful) and counts all the rows in all of the sheets in all of the files. This program can give a quick look at "progress" if daily operations involve adding data to files increasing their rows with ledger items, new cases, etc. (3) xls-backup.bas, written by a qxls user, will backup all .xls files in a diretory to .csv files in a user-specified directory. For each .xls file with multiple, non-empty sheets, xls-backup will write a separate .csv file for each sheet. This program is more than a demo, since it has practical use for backup as it comes "right out of the box". If your files are not BIFF8, just read them into Excel 97 and up and save in the new format. 2. QUICKSTART (Please see the demo program qxls.bas) Select a workbook index (dimensioned as b in qxls.inc). b is a numeric value from 0 to MaxBooks(a constant). With constant MaxBooks = 7, there may be eight workbooks open simultaneously. These are stored in RAM, so more open workbooks means more usage of system resources, especially for large files. Define FileSpec(b) for workbook b, which is the full path and file name. For example, for the demo.xls included, this is FileSpec(b)="demo.xls" 3. QXLS FUNCTIONS assume b is set to the workbook you want to access. 3.1 ret = xlsOPEN There are no arguments. The values b and FileSpec(b) are used. xlsOPEN opens the file and parses its contents, storing key values in data variables associated with the value b. ret is a numeric variable which you define As LONG in your program. xlsOPEN returns to ret -1 for error or the number of sheets in the file. 3.2 string = xlsREAD(s, i, j) s sheet number from 1 to ret in 3.1 above. i 0-based row number. j 0-based column number. i, j and b above are 0-based, which means the first item has a 0 index. xlsREAD returns four values: (1) a value string, (2) xType number, (3) sERR string and (4) various values and pointers. xUnicode is the pointer to a unicode string structure. sLength is the unicode structure length. xIEEE = 8-byte floating number if applicable (xType). xInt = 4-byte integer if applicable (xType). [For advanced programmers, xCode contains the record type from which the value was obtained.] For example, x$=xlsREAD(2,0,0) 'read cell A1 in sheet 2 x$=xlsREAD(1,3,2) 'read cell C4 in sheet 1 x$ is the value. If a string, xType = -1. If a number, xType >= 0. For numbers, xType gives the RK code which indicates floating or integer and whether or not the stored value was multiplied by 100 (see RK constants in qxls.inc). If xType = 4, then the value was stored as an 8 byte IEEE floating point number (double in RapidQ). All values are returned as strings. Use the VAL(x$) function to convert numeric values to numeric variables. Or read the binary form as follows. For example, IF sERR <> "" THEN (your code to look at that here) IF x$ <> "" THEN SELECT CASE xType 'yourvar=VAL(x$) gives you the value CASE -1: mystring = x$: Goto ItWasString CASE 0, 4: mydoublevar = xIEEE: Goto ItWasDouble CASE 1: mydoublevar = xIEEE: Goto ItWasDoubleX100 'value x 100 CASE 2: mylongvar = xInt: Goto ItWasInteger CASE 3: mylongvar = xInt: Goto ItWasIntegerX100 'value x 100 CASE ELSE: PRINT "xType value violated UN human rights charter" END SELECT ItWasString: '...etc... END IF If the requested cell is not defined or blank in the workbook, x$ will be null. If the cell contains a formula, the resulting formula value is returned by xlsREAD [and xCODE may not be a FORMULA type record]. Since the file has already been read and parsed, there are not many errors that can occur. The string variable sERR is defined for each call to xlsREAD. If sERR = "" (null), then there was no error reported. If the first character of sERR is "#", this indicates special results such as #FORMULA, #BLANK, #BOOLERR and #XXXX where XXXX is a four-character hex value for an unrecognized record type. If you get any #XXXX, please notify the author. If x$ is null and sERR = "#BLANK", it may indicate that somebody deleted what was in that cell. 3.3 ret = xlsCLOSE There are no arguments. The RAM for workbook b is freed and internal data arrays describing that workbook are cleared. ret is -1 for error (b > MaxBooks) or b for success. That value of b may be used to open another workbook file. 3.4 ret = xlsSAVE(File As STRING) Saves workbook b to file specified in "File" string, with ret = file size. Your main program should probably check if File = FileSpec(b) used in xlsOPEN, and if so, prompt the user regarding overwriting/replacing the existing file. Advantages of xlsSAVE include (1) files are saved with all of the data in continuous streams (uncorrupted) and (2) files will be shorter since only the OLE header, BIFF8, OLE FAT, OLE directory and if applicable, extensions of the FAT index, are written. That is, basic code in the original file and any "summary information" and "document summary" is not written. This latter point would exclude, for example, certain authorship information, and may be viewed as an advantage or a limitation, depending on your needs. 3.5 ret = xlsWRITE(s, i, j, value As STRING) xlsWRITE stores the value in sheet s, row i, column j, as in xlsREAD above. If the value represents a number, xlsWRITE determines how to store it (integer or floating, etc). ret = the number of bytes written in the new or replacement record. All relevant pointers are updated with each xlsWRITE. Thus, if the file is saved, it should be readable both by Qxls programs and by Excel. xlsWRITE is "hot off the press" and has several limitations, which coming updates hope to reduce. As with any new code, one should watch for bugs or other problems, and be kind enough to notify the author. The major known limitation at present is that xlsWRITE is limited to changing values or entering new values in "defined rows." A "defined row" is a row that already exists in the file when it is opened. Before this limitation is reduced or completely removed, one can use a template file that already has the number of rows you will need defined. Just save it from Excel, load it with xlsOPEN and away you go. This procedure is not unlike Excel itself which loads a "blank book." 3.6 string = xlsSHEETNAME(s As LONG) 'Returns name of sheet s in book b. 3.7 ret = xlsMINROW(s As LONG) 'Returns first row of sheet s in book b. 3.8 ret = xlsMAXROW(s As LONG) 'Returns last row of sheet s in book b. 3.9 ret = xlsMINCOL(s As LONG) 'Returns first column of sheet s in book b. 3.10 ret = xlsMAXCOL(s As LONG) 'Returns last column of sheet s in book b. 3.11 ret = xlsSHEETLENGTH(s As LONG) 'Returns length in bytes of sheet s in book b. 3.12 ret = xlsNUMBERSHEETS 'Returns number of sheets in book b (like xlsOPEN). 4. EXAMPLES assuming the workbooks b have been opened with xlsOPEN: FOR b = 0 to 3: MyArray$(b) = xlsREAD(2,2,1): NEXT b reads cell B3 of sheet 2 in workbooks 0 to 3. Note that the i argument (2) is row "3" and the j argument (1) is column "B". b=0: price$ = xlsREAD(1,2538,5): b=3: asset$ = xlsREAD(3,24,6) The price is in book 0, sheet 1, cell F2539; assets in book 3, sheet 3, cell G25. b=1: item$ = xlsREAD(mysheet, myrow-1, mycol-1) In workbook 1, get the value at myrow and mycol in mysheet. Please look at the dimensioned variables in qxls.inc for two purposes. First, the variables labeled as scratch may be used by your program between calls to qxls functions, but may be changed by function calls. Second, the workbook specific variables may be accessed by your program, such as FileSpec(b), etc. In general, there should be read only access of the "internal variables" used by the functions. It is better to use the defined functions above so that any future change in the layout of qxls internal data will not require you to change your application program. 5. HOW DO I WRITE EXCEL FILES? There are several options: First, xlsWRITE(s, i, j, v) and xlsSAVE(f) functions are rather powerful although there are limitations in the current version described above. Second, write a .csv file which is just a character delimited text file where each row is an Excel file row. This can be imported into Excel and saved. Your application program can also shell to "excel.exe myfile.csv". 6. DESIGN CONCEPTS The overall goal was to write functions that work reliably. qxls.inc uses indexing data in workbooks and creates some of its own indeces during xlsOPEN to speed up access to data. Thus, in large files, data is accessed almost as fast as in small files. The xlsOPEN function will take more time (1) if xIntegrity = 1, (2) if the file is large, (3) if the file contains more strings in the SST table which is parsed during xlsOPEN to create the indeces qxls uses to access these strings. The internal data tables were set up with the view that new features including writing values could be added. Within the present framework of qxls.inc, it would be relatively easy to add access/modification to cell formating (borders, etc), the codes for kinds of #BOOLERR values, etc. If qxls.inc is improved, your program will not become obsolete because the basic function calls will be the same. qxls.inc does not modify in any way your existing Excel files unless you save a file with exactly the same file specification (as with any program). It is recommended that saved files be renamed or you keep backups of existing files. 7. QXLS.INC WILL NOT READ A FILE A design principle was generally to reject the file if any cross-reference does not check out. For example, BOUNDSHEET records point to the start of the sheet and if this does not agree with the actual start offset of the sheet, the file is rejected. The general idea is you want accuracy in the data you read or knowledge that something is wrong. What to do, if qxls.inc will not parse (open) the file? First, be sure that there are not more sheets in the file than are dimensioned in the constants described below. Also, for very large numbers of strings, the SST constant might need to be larger (display nSST(b) to see that). Try reading the file into MS Excel and possibly saving it in shorter versions or even as .csv which will then be read and saved again as .xls, hopefully restoring file integrity. qxls.inc may think the file format is bad because of a fault in qxls.inc. If you think this is the case, maybe you might send me the file. [Note: please do not send files with information that you or any other person might consider to be private or confidential. Labels or other indicators can be deleted or changed. If the problem centers on particular cells, just copy those cells to the demo.xls included, so that not even your authorship information which may be stored in your files is sent.] The xIntegrity variable may be set to "1" to use qxls.inc as a file integrity tester. With xIntegrity = 1, qxls will open a file parsing all the records in all the sheets. For files greater than one megabyte in length, possibly with tens of thousands of unique strings, the parsing in the present version works well. For example, the SST record contains its count of unique strings. xlsOPEN parses each string to count them and compare this count with the recorded count. If they don't agree, the file is rejected. At every step in xlsREAD any inconsistency or error is reported by the sERR string. Thus, the user may have more confidence that a null return upon reading a particular cell is believable. That is, in some applications, the absence of a particular value is as important as the presence of a value. 8. MORE DETAILS Look at comments in qxls.inc and particularly the constants: CONST MaxBooks& = 7& CONST MaxSheets& = 15& 'if your files have more or less sheets, edit this. CONST MaxBytes& = 4095& 'used as a buffer CONST MaxSST& = 255& 'used by qxls to create indexing to strings The sERR string will notify you that one of the above constants is too small to handle your particular workbooks. Simply edit qxls.inc to solve this. Likewise, some RAM is saved if your programs never open more than three workbooks at a time and you edit the constant in Qxls.inc: CONST MaxBooks& = 2& 'three simultaneous books (0 - 2) 9. CONTACT For more features or problems, please share your thoughts and experience regarding what you need. doctorelectron@cwdom.dm Most recent versions: Net Census, http://www.angelfire.com/space/netcensus/