After data is loaded into a warehouse, additional measures must be taken to ensure that the data in the warehouse is periodically refreshed to reflect updates to the data sources, and to periodically purge data that is too old from the warehouse (perhaps onto archival media). Observe the connection between the problem of refreshing warehouse tables and asynchronously maintaining replicas of tables in a distributed DBMS. Maintaining replicas of source relations is an essential part of warehousing, and this application domain is an important factor in the popularity of asynchronous replication, despite the fact that asynchronous replication violates the principle of distributed data independence. The problem of refreshing warehouse tables (which are materialized views over tables in the source databases) has also renewed interest in incremental maintenance of materialized views.
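To make the idea of incremental maintenance concrete, the following is a minimal sketch in Python rather than a description of how any particular warehouse product works. It assumes a hypothetical source table of (region, amount) rows and a materialized view that stores the row count and total amount per region. The names recompute_view and apply_delta are invented for this illustration.

```python
def recompute_view(source_rows):
    """Full refresh: rebuild (row_count, total) per region from all source rows."""
    view = {}
    for region, amount in source_rows:
        count, total = view.get(region, (0, 0))
        view[region] = (count + 1, total + amount)
    return view


def apply_delta(view, inserted=(), deleted=()):
    """Incremental refresh: adjust the stored view using only the changed rows."""
    updated = dict(view)
    for region, amount in inserted:
        count, total = updated.get(region, (0, 0))
        updated[region] = (count + 1, total + amount)
    for region, amount in deleted:
        count, total = updated[region]
        if count == 1:
            # The last row of this group was deleted, so the group disappears.
            del updated[region]
        else:
            updated[region] = (count - 1, total - amount)
    return updated


# Hypothetical source data and one batch of changes.
source = [("east", 100), ("west", 50), ("east", 25)]
view = recompute_view(source)
view = apply_delta(view, inserted=[("west", 30)], deleted=[("east", 25)])
```

The row count is kept alongside the total so that a group can be removed when its last row is deleted. A total of zero alone would not show whether any rows remain. The incremental result should match a full recomputation over the updated source, which is the property that makes the cheaper refresh acceptable.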
An important task in maintaining a warehouse is keeping track of the data currently stored in it; this bookkeeping is done by storing information about the warehouse data in the system catalogs. The system catalogs associated with a warehouse are very large, and are often stored and managed in a separate database called a metadata repository. The size and complexity of the catalogs are in part due to the size and complexity of the warehouse itself, and in part because a lot of administrative information must be maintained. For example, we must keep track of the source of each warehouse table, and when it was last refreshed, in addition to describing its fields.
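As an illustration of the kind of administrative information described above, here is a small Python sketch of one catalog entry per warehouse table. It is only an assumed layout, not the schema of any real metadata repository. The class TableMetadata, the function stale_tables, and the table and source names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TableMetadata:
    name: str                 # warehouse table name
    source: str               # where the data came from
    fields: dict              # field name -> field type
    last_refreshed: datetime  # time of the most recent refresh


def stale_tables(catalog, now, max_age):
    """Return the names of tables whose last refresh is older than max_age."""
    return [entry.name for entry in catalog if now - entry.last_refreshed > max_age]


# Hypothetical catalog with two warehouse tables.
catalog = [
    TableMetadata("sales_by_region", "orders_db.sales",
                  {"region": "text", "total": "integer"}, datetime(2024, 1, 9)),
    TableMetadata("customers", "crm_db.customers",
                  {"id": "integer", "name": "text"}, datetime(2023, 12, 1)),
]
overdue = stale_tables(catalog, datetime(2024, 1, 10), timedelta(days=7))
```

With a record like this for every table, an administrator can ask which tables are overdue for a refresh without inspecting the warehouse data itself.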
The value of a warehouse is ultimately in the analysis that it enables. The data in a warehouse is typically accessed and analyzed using a variety of tools, including OLAP query engines, data mining algorithms, information visualization tools, statistical packages and report generators.
【New Words】
asynchronous
不同时的,异步的
partition
分割,划分,区分
catalog
目录,编目录
mismatch
错配,使搭配不当
metadata
元数据
scalability
可扩展性
ODBC (Open Database Connectivity)
开放数据库可连接性
Exercise
1. Multiple Choice
(1)A database file is made up of _____.
A. records
B. characters
C. fields
D. text
(2)The database program searches the records for a match in a particular field to whatever data you specify. This is called ______ a database.
A. displaying
B. sorting
C. calculating
D. querying
(3)______ is the latest trend in database software.
A. Object-oriented design
B. Web database integration
C. Data warehousing
D. Standalone modules
(4)A hierarchical data model always starts with a ______.
A. children
B. parent
C. first
D. root
(5)A _____ assortment of data cannot be referred to as a database.
A. coherent
B. inherent
C. logical
D. random
(6)A database will provide different people varied ______ to the same data.
A. retrieval
B. access
C. reach
D. record
(7)The DBMS can maintain the integrity of the database by allowing ______ to update the same record at the same time.
A. no more than one user
B. more than one user
C. two customers with similar customer numbers
D. company specialists
(8)The contribution of “Internet database website” is ______.
A. enhance the efficiency
B. cut down the cost
C. save the manpower
D. any change in the database is updated synchronously on the website
(9)An item of data in a database such as a number, a name, or an address is called ______.
A. field
B. schema
C. record
D. subject
(10)The DBMS accepts requests for data and instructs the operating system to _____ the appropriate data.
A. take in
B. transfer
C. reject
D. accept
2. Translate the following phrases into Chinese
(1)Logical schema
(2)Parent node
(3)System-oriented
(4)Index card
(5)Atomic data type
(6)data structure
(7)return address
(8)fatal error
3. Decide whether the following statements are True or False
(1)Database management systems make it easy to change the order of records in a file.
(2)A basic feature of all database programs is the capability to locate records in the file quickly.
(3)Data security prevents unauthorized users from viewing or updating the database.
(4)A database management system (DBMS) is an extremely complex set of software programs that controls the organization, storage and retrieval of data (fields, records and files) in a database.
(5)The DBMS can reject valid dates entered into data fields and alphabetic data entered into money fields.
(6)Data integrity is best served when there is one controlling source for validation of data.
4. Translate the following passage into Chinese
Electronic Data Interchange (EDI) may be most easily understood as the replacement of paper-based purchase orders with electronic equivalents. It is actually much broader in its application than the procurement process, and its impacts are far greater than mere automation. EDI offers the prospect of easy and cheap communication of structured information throughout the corporate community, and is capable of facilitating much closer integration among hitherto remote organizations.