Top

Collecting Solutions with "all"

French

German

Italian

Spanish

Portuguese

Japanese

Korean

Chinese

Database Files

Database Files in FormulaOne
Files as Variables
Reading Database Files
Positioning the File Pointer
Altering Database Files
Database Files as Predicates
Index Files
Manipulating Files
Summary

A FormulaOne database file is a list of values of a given type, stored in an efficient binary format. In the following discussion, we'll use the term database file to refer to the concept of a FormulaOne file, and external file to refer to the files as they appear in OS file system. When there's no possibility of confusion, we'll just use file. This section will discuss how files interact with, and are manipulated by, programs.

Database Files in FormulaOne

A database file, in the FormulaOne world, is like something in-between a module, a predicate, and a variable. It's defined like a module, but can be accessed in programs like a predicate and passed in queries like a variable. The actual contents of a database file is a set of records stored in a file on secondary storage, such as hard disk. Each record is a value of a FormulaOne type associated with the file by the programmer.

Let's look at a typical definition of a file, to see how it links to programs and to secondary storage. Say we had defined the type P_data_t. Then we could define the type of file with records of type P_data_t as follows:

P_data_ft = file P_data_t

(In this document, we use the convention that file types end with the characters '_ft'.) The file is treated as a data structure of the given type, which must be a file type. For example, if the database file storing the actual records is c:\Database\pdata.dbs, we can define P_data

P_data :< P_data_ft = 'c:\\Database\\pdata.dbs':P_data_ft

The record type of a file does not have to be a tuple; it can be a union expression or basic type (such as S), and it can even be a recursively-defined tuple, like Tree_t. So an entire tree structure can be stored on a disk file for later recovery into another run of a query.

Files as Variables

A file such as P_data can be referred to in the FormulaOne query processor as an input/output variable, which can be passed to procedures in queries. We can think of its structure as being a list of values of the file type, along with a file pointer, which points to some value within that list. When we pass a file to a procedure in a query, initially the file pointer points to the beginning of the file.

start->|A|....|P|Q|R|....|Z|<-end
  ptr->|

Then, as we advance the file pointer, it points to records further and further on in the file.

start->|A|....|P|Q|R|....|Z|<-end
           ptr->|

Records are deleted or inserted by deleting them from or inserting them into the list at the point to which the file pointer is pointing. In the program, however, we cannot refer to file variables as lists with pointers. We can only pass them as arguments to predicates and procedures.

Several predefined procedures exist for manipulating files and file pointers. They are:

DbAccess
DbGet
DbSkip
DbEof
DbRewind
DbEofSet
DbPut
DbInsert
DbDelete
DbTruncate

The first six change the file pointer without affecting the contents of the file, while the last three change the contents. These procedures are not accessible automatically. They reside in the standard module files, that comes with the FormulaOne package. If one of your modules uses the file procedures, you must therefore specify that your module uses the file module. When you create or update the module you should enter names of used modules into the module header.

Reading Database Files

The predefined procedures DbAccess and DbSkip can be used to read database files, within procedures.

{prints each record in 'f'}

proc Printall(f:.P_data_ft) iff
    if DbAccess(f, person) then
        Print(person,'\n') & DbSkip(f,1) & Printall(f)
    end

If we have the above procedure in a compiled module, and the P_data database file defined as above, we can issue the query

Printall(P_data)

and expect it to print all the records in the file (if there are any).
Typical output from this query might be

'Heather',Female,(1961,7,2),(0,1,1),'Musician'
'Isadora',Female,(1924,3,22),(0,1,1),'Homemaker'
'Steven',Male,(1921,9,20),(0,1,1),'Engineer'

DbAccess file predicates, takes two arguments, a file and an output variable of the record type of the file. If T is the record type, then DbAccess can act as a procedure with the following declaration.

proc DbAccess(f:<file T, rec:>T)

DbAccess, like all the other file procedures, is polymorphic. It can act as a procedure with the given types of parameters for any type T. It has the effect of setting the output variable to the next record in the file after the file pointer. For instance, if the file is in the following state, the record read into the second argument will be Q.

start->|A|....|P|Q|R|....|Z|<-end
  ptr->|

If the pointer is pointing after the last record in the file - that is, if there are no records to fetch - DbAccess fails. DbAccess never actually changes the file pointer, even after it has copied the record into its output parameter. That is the job of DbSkip. DbSkip takes two arguments, a file and an integer containing the number of records to skip.

proc DbSkip(f:.file T, n :<L)

It has the effect of advancing the file pointer past the record it is currently pointing to, by the given number of records. Before DbSkip(f, 1):

start->|A|....|P|Q|R|....|Z|<-end
           ptr->|

After DbSkip(f, 1):

start->|A|....|P|Q|R|....|Z|<-end
             ptr->|

If the number of records in the file is less than the number requested to skip, DbSkip moves the pointer after the last record. This includes the case in which the pointer is at the end of the file (0 records left). DbSkip never fails unless its second argument is less than 0. So in the processing of the query above, DbAccess gets into variable person the next record after the file pointer, and each call to DbSkip advances the file pointer by one record. DbAccess fails when the end of the file is reached, causing the recursion to "bottom out". DbAccess is a procedure with no input/output parameters, so it can be used in if formulas. In circumstances where you don't need to test whether or not a record is there, you may more likely use the DbGet procedure predicate than DbAccess in your programs.

proc DbGet(f:.file T, rec:>T)

DbGet(f,x) has exactly the same effect as DbAccess(f,x) & DbSkip(f,l); that is, it gets the current record into x and advances the pointer by one record - unless the pointer is at the end of the file, in which case the DbGet fails. DbGet works well in combination with the procedure DbEof, which succeeds if and only if its argument is a file positioned at the end.

proc DbEof(f :<file T)

DbEof also has no input/output parameters, and can be used in ifs. Using DbGet and DbEof, we can write an alternative version of our earlier Printall procedure.

{prints each record in 'f'}

proc Printall1(f:.P_data_ft) iff
    if DbEof(f) then
        Print('End of file')
    else
        DbGet(f,person) & Print(person,'\n') & Printall1(f)
    end

Positioning the File Pointer

Two procedures are provided for positioning the file pointer within the file, DbRewind and DbEofSet. DbRewind takes an input/output file variable as its only argument,

proc DbRewind(f:.file T)

It has the effect of simply setting the pointer back to the beginning of the file, before the first record.

start->|A|....|P|Q|R|....|Z|<-end
  ptr->|

DbEofSet also takes an input/output file as its argument.

proc DbEofSet(f:.file T)

It has the effect of setting the pointer to the end of the file, after the last record.

start->|A|....|P|Q|R|....|Z|<-end
                      ptr->|

DbEofSet is not very useful when reading a file, and is usually used in conjunction with file-writing routines. However, DbRewind can come in handy when reading, as in this Read_circ routine which treats the file as a circular list of records:

{ read 'pers' from 'f', rewinding to the start if end-of-file }

proc Read_circ(f:.P_data_ft, pers:>P_data_t) iff
    if DbEof(f) then
        DbRewind(f)
    end & DbGet(f,pers)

Note that in the above procedure, if the file is empty, the DbGet call will fail, so the procedure will fail. This behavior is probably consistent with what we want from it.

Altering Database Files

So far, we have dealt only with procedures which alter the file pointer, if they alter anything at all. But it is also possible to alter the contents of the file, inserting and deleting records. The DbPut, DbInsert, DbDelete, and DbTruncate procedures perform these actions.

DbPut

DbPut takes two arguments, an input/output file variable and a value of the file's record type.

proc DbPut(f:.file T, x:<T)

It inserts a record, with given value, before the next record after the file pointer, and then skips past the inserted record. Say the file is in the following state.

start->|A|....|P|Q|R|....|Z|<-end
           ptr->|

If a record - we'll call it V - is inserted with DbPut, the new state of the file will be the following.

start->|A|....|P|V|Q|R|...|Z|<-end
             ptr->|

DbPut skips the inserted record automatically, so that repeated DbPuts insert records in sequence.
The following code demonstrates the usage of DbRewind, DbTruncate and DbPut:

 proc CreateP_dataFile(f:.P_data_ft) iff
    DbRewind(f) & DbTruncate(f) &   {delete any existing records first}
    DbPut(f,('Heather',Female,(1961,7,2),(0,1,1),'Musician')) &
    DbPut(f,('Isadora',Female,(1924,3,22),(0,1,1),'Homemaker')) &
    DbPut(f,('Steven',Male,(1921,9,20),(0,1,1),'Engineer'))

Note the code line containing DbRewind(f) & DbTruncate(f). These two calls in effect delete all previously existing records from the file. Without this line, multiple calls to CreateP_dataFile would keep adding the same three records to the database file with each call.

DbInsert

DbInsert is an alternative to DbPut. DbInsert uses the same arguments as DbPut. Using DbInsert creates a database file in sorted order. Consider the routine CreateP_dataFile1:

proc CreateP_dataFile1(f:.P_data_ft) iff
    DbInsert(f,('Isadora',Female,(1924,3,22),(0,1,1),'Homemaker')) &
    DbInsert(f,('Heather',Female,(1961,7,2),(0,1,1),'Musician')) &
    DbInsert(f,('Steven',Male,(1921,9,20),(0,1,1),'Engineer')) &
    DbInsert(f,('Isadora',Female,(1924,3,22),(0,1,1),'Homemaker')) &
    DbInsert(f,('Steven',Male,(1921,9,20),(0,1,1),'Engineer'))

The routine creates a database file that contains only the following records:

'Heather',Female,(1961,7,2),(0,1,1),'Musician'
'Isadora',Female,(1924,3,22),(0,1,1),'Homemaker'
'Steven',Male,(1921,9,20),(0,1,1),'Engineer'

Unlike using DbPut, DbInsert places the records within the database file based on the actual records content. The records are inserted in standard order for sorting. Standard order of sorting basically specifies a method to determine which record is "smaller". If a record of the same value already exists, the record in not inserted again, otherwise the record is inserted after the first record of a "smaller" value. To keep the whole database file in sorted order, do not mix DbPut and DbInsert.

DbDelete

DbDelete has one argument, the input/output file variable,

proc DbDelete(f:.file T)

It removes the record after the pointer, as if doing a DbSkip and deleting whatever it encountered on the way. As with DbSkip, if the pointer is at the end of the file, DbDelete fails. Before DbDelete:

start->|A|....|P|Q|R|....|Z|<-end
           ptr->|

After DbDelete:

start->|A|....|P|R|....|Z|<-end
           ptr->|

DbTruncate

DbTruncate also takes a single input/output file,

proc DbTruncate(f:.file U)

It has the effect of deleting every record in the file from the current record on, that is, "truncating" the file at the current file pointer location. Before DbTruncate:

start->|A|....|P|Q|R|....|Z|<-end
           ptr->|

After DbTruncate:

start->|A|....|P|<-end
           ptr->|

Database Files as Predicates

We have seen how database files can be accessed as variables, for the purposes of explicit reading and updating. But a program can also refer to a database file as a predicate, using the file name as a predicate name. To do this, the program module must use both the files module and the database file itself. In the module header include both as used. Once you have done this, each file in the list of used modules can be referred to in the program as if it were a predicate whose arguments were the fields of the tuple which is the file record type.
The file effectively acts as a module containing one predicate, which is named the same as the file. For example, say we had declared P_data as above, and were writing a module which used P_data. Then we could include the following predicates in the module:

pred Man(person_name::S) iff
    P_data(person_name, Male, b, d, c)

pred Woman(person_name::S) iff
    P_data(person_name, Female, b, d, c)

The queries:

all Man('Steven')

all Woman('Steven')

will succeed and fail respectively. P_data, when used in this way, acts as a predicate which matches its arguments with any of the records in the file.
For example, if P_data contained the records

('Heather',Female,(1961,7,2),(0,1,1),'Musician')
('Isadora',Female,(1924,3,22),(0,1,1),'Homemaker')
('Steven',Male,(1921,9,20),(0,1,1),'Engineer')

then P_data as a predicate would be equivalent to the following:

pred P_data_pred(tuple:>P_data_t) iff
    tuple = ('Heather', Female,(1961,7,2),(0,1,1),'Musician')
  | tuple = ('Isadora', Female,(1924,3,22),(0,1,1),'Homemaker')
  | tuple = ('Steven', Male,(1921,9,20),(0,1,1),'Engineer')

P_data can also be accessed in queries, as any regular predicate can; for instance,

all P_data(x)

all P_data_pred(x)

will list all the records in the P_data file, and

all n P_data(n, g, b, d, c)

all n P_data_pred(n, g, b, d, c)

will list just the names (the expression in the parentheses being a "deconstructor" term which gets the value of the output parameter as it comes out).
Note that in the predicate definition above, the string parameters and gender appear as symbolic parameters, while the two Date_t parameters appear as output. This is the general case with all files being used as predicates. Parameters of type S, I, L, R, or enumerated types (that is, the basic types) are symbolic, while those of all other types (including tuple types) are output. This means that any kind of argument can be given to parameters of basic types, whereas symbolic arguments cannot be given to parameters of non-basic types.
A database file predicate is treated as just another predicate in most ways. Within procedures, all arguments to database file predicates must be full-value. File predicates may also appear as if formula conditions, or preceded by the "not" connective (~), if all arguments are full-value.

Index Files

To search a database file sequentially can be prohibitably inefficient for large files. Index files allow to avoid time-consuming searches through an entire database file in order to find a particular record. FormulaOne implements a set of routines to allow fast database searches of a database record of any type. Prior to creating a database, a decision should be made about the database record fields we are likely to search. Let's consider a database of records of type P_data, defined previously as

P_data_t = (name:S,          {person's name}
            gender:Gender_t,
            b_date:Date_t,   {date of birth}
            d_date:Date_t,   {date of death}
            comment:S)

Let us also assume we will want to search the database for a specific name, birthday date b_date, gender and the year of birth b_date.year. For this purpose we define the following database file type:

P_data_ft2 = file P_data_t[gender,name,b_date,b_date.year]

Creating a database file of the type P_data_ft2 will not only create a database file containing the actual records of type P_data_t, but also index files corresponding to the record fields name, birthday date b_date, gender and the year of birth b_date.year. There will be one separate index file for each index. Note that the order of indeces is irrelevant, but two file types with the same indeces but with the indeces in a different order are considered two different types. So when we create a database file pdataIx.dbs using the following declaration:

P_data2 :< P_data_ft2 = 'pdataIx.dbs':P_data_ft2

several files will be created: pdataIx.dbs, pdataIx.dbs.000, pdataIx.dbs.001, pdataIx.dbs.002, pdataIx.dbs.003. Inserting or deleting a record in the database file will automatically update all index files, so modifying an indexed database files can be significantly slower than modifying a database file without indeces, however searching the indexed database will be significantly faster than searching a database file without indeces.
To illustrate the usage of various index file routines, le us first create a simple indexed database:

proc CreateIndexFile(f:.P_data_ft2) iff
    DbRewind(f) & DbTruncate(f) &   {delete any existing records first}
    DbPut(f,('Heather',Female,(1961,7,2),(0,1,1),'Musician')) &
    DbPut(f,('Isadora',Female,(1924,3,22),(0,1,1),'Homemaker')) &
    DbPut(f,('Steven',Male,(1921,9,20),(0,1,1),'Engineer')) &
    DbPut(f,('Alexa',Female,(1973,10,11),(0,1,1),'Student')) &
    DbPut(f,('Douglas',Male,(1992,5,14),(0,1,1),'Student')) &
    DbPut(f,('Lukas',Male,(1989,6,28),(0,1,1),'Student')) &
    DbPut(f,('Megan',Female,(1992,5,14),(0,1,1),'Student')) &
    DbPut(f,('Boris',Male,(1955,7,25),(0,1,1),'Programmer')) &
    DbPut(f,('John',Male,(1945,7,15),(0,1,1),'Trucker')) &
    DbPut(f,('Ernest',Male,(1955,7,25),(0,1,1),'Programmer')) &
    DbPut(f,('Patrick',Male,(1955,2,17),(0,1,1),'Programmer')) &
    DbPut(f,('Jimmy',Male,(1955,2,17),(0,1,1),'')) &
    DbPut(f,('Emanuella',Female,(1955,12,25),(0,1,1),'born on Christmas Day!')) &
    DbPut(f,('Ella',Female,(1924,3,1),(0,1,1),'Retired'))

Note that in the code above we did not bother to keep the database file records in any particular sorted order. To create the database file and all index files, run the query

CreateIndexFile(P_data2)

Having successfully created the files, now we can easily search the database for various records. For example, the code in DumpAllBirthdayYear will print out all records with the given year of birth:

proc DumpAllBirthdayYear(f:.P_data_ft2,year:<I) iff
    DbSeek(f.b_date.year,year) & DumpAllBirthdayYear1(f)

proc DumpAllBirthdayYear1(f:.P_data_ft2) iff
    if DbAccess(f,y) then
        Print(y,'\n') &
        DbSeekNextEq(f.b_date.year) & DumpAllBirthdayYear1(f) 
    else
        Print('<EOF>')
    end

For example, we can find all records with the year of birth 1955 by issuing the query

DumpAllBirthdayYear(P_data2,1955)

Given the records as entered in the routine CreateIndexFile, the results will be

'Boris',Male,(1955,7,25),(0,1,1),'Programmer'
'Ernest',Male,(1955,7,25),(0,1,1),'Programmer'
'Patrick',Male,(1955,2,17),(0,1,1),'Programmer'
'Jimmy',Male,(1955,2,17),(0,1,1),''
'Emanuella',Female,(1955,12,25),(0,1,1),'born on Christmas Day!'
<EOF>

It is easy to create similar routines to search records using different record fields.
Although the records in the database file are in no particular order, the index files allow us to browse all records in an ordered way. The order can be either ascending, or descending, using pairs of routines DbSeekFirst and DbSeekNext or DbSeekLast and DbSeekPrev respectively. For example, the code in DumpAllNames will print out all records sorted in ascending order based on the filed name:

proc DumpAllNames(f:.P_data_ft2) iff
    DbSeekFirst(f.name) & DumpAllNames1(f)

proc DumpAllNames1(f:.P_data_ft2) iff
    if DbAccess(f,y) then
        Print('\nname:', y) &
        DbSeekNext(f.name) & DumpAllNames1(f) 
    else
        Print('<EOF>')
    end

There are several additional runtime index file seek routines, they all either find the corresponding record and position the database file pointer to point to the record or position the file pointer to the end of the file. Once the file pointer has been set to point to a record, you can use DbAccess, or DbGet to read the record or DbDelete to delete the record. Deleting the record will delete all corresponding index file entries for the record automatically. Once a record has been deleted, any record comparisons are ill defined; therefore start a new search with an absolute DbSeek instead of relative seeks such as DbSeekNextEq, DbSeekNextGt etc. The various seek routines are described in more detail in the section describing the Runtime Library: DbSeek, DbSeekNext, DbSeekNextEq, DbSeekPrev, DbSeekPrevEq, DbSeekFirst, DbSeekLast, DbSeekLt, DbSeekGt

Manipulating Files

Files other than database files can be also manipulated from procedures in much the same way as database files can be. Two other file types have been predefined: Bin and Ascii. An Ascii file is interpreted as a standard text file, consisting of lines of printable text. Each string record in the file corresponds to one of the lines, so that DbGet routine (for instance) gets lines of the file in sequence, from first line to last line. For example, the following code will read and print out a text file:

proc DumpAscii(f:.Ascii) iff
    if DbAccess(f,y) then 
        Print(y,'\n') & DbSkip(f,1) & DumpAscii(f)
    end

DumpAscii can be called to print out a particular text file using the file name (keep in mind the file names are case insensitive):

DumpAscii('c:\\Temp\\License.txt')

Records, i.e. lines, cannot be deleted from an Ascii file, and no record can be inserted in the middle. If either of these operations is attempted, an error condition is raised. However, records can be appended to the end of the file with the DbPut procedure. A Bin file is interpreted simply as a sequence of bytes. Each DbGet operation will get the next byte in the file, regardless of whether the file is in text format, database file format, or any other format. These one-byte records cannot be deleted, nor can any records be inserted in the middle. The DbPut procedure, however, has a special behavior with Bin files: instead of inserting a byte, it overwrites the current byte, therefore doing the equivalent of a DbDelete followed by a DbPut.

Summary

File format can be database, Ascii or Bin
Database files can be used as variables or as predicates.
A file used as a variable has a file pointer associated with it, which indicates the location at which operations will take place within the file.
The file predicates DbAccess, DbSkip, DbGet, and DbEof can be used to read a file.
DbRewind and DbEofSet can be used to position the file pointer at the beginning or end of the file.
DbPut, DbInsert, DbDelete, and DbTruncate are the only file predicates that change the contents of a file.
Ascii and Bin files can be manipulated in procedures as if they were database files.
For large databases use index files.
Index file search operations are significantly faster then sequential database file search.

Top

Building Types

Collecting Solutions with "all"