Original Articles
Rui Zhang, Jianwei Wang, Wenju
Abstract
The main task in sharing data across heterogeneous databases is semantic integration. In relational databases, the primary problem is identifying equivalent attributes. At present, the commonly used approach is to compare every attribute with every other attribute. However, when the same attribute is expressed with different data types, its metadata and value information differ so greatly that these methods fail to identify it. And when such methods are nevertheless applied to attributes expressed with different data types, the resulting interference lowers the accuracy of identification. In this paper, an attribute matching method based on data type is proposed. The attributes are first classified by data type; pattern matching is then performed only among attributes of the same data type; and the feature vectors that describe the attributes are sorted by the importance of their features. Experiments show that the method effectively filters out interfering data and improves the efficiency of attribute matching without reducing precision or recall.
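The three steps named in the abstract (classify by data type, match within a type, order features by importance) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The feature names (`length`, `nullable`, `distinct_ratio`), the weights, the weighted-distance similarity, the threshold, and the use of the importance ordering for early rejection are all assumptions made here for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def group_by_type(attributes):
    """Step 1: bucket attributes by their declared data type."""
    groups = defaultdict(list)
    for attr in attributes:
        groups[attr["type"]].append(attr)
    return groups


def order_features(weights):
    """Step 3: feature names sorted from most to least important."""
    return sorted(weights, key=weights.get, reverse=True)


def similarity(a, b, weights, threshold):
    """Weighted similarity in [0, 1]; returns None on early rejection.

    Assumes every feature value is normalized to [0, 1]. Features are
    visited in order of importance, so a pair that already differs too
    much on the important features is dropped without reading the rest.
    """
    total = sum(weights.values())
    budget = (1.0 - threshold) * total  # largest distance still accepted
    dist = 0.0
    for name in order_features(weights):
        dist += weights[name] * abs(a["features"][name] - b["features"][name])
        if dist > budget:
            return None
    return 1.0 - dist / total


def match_attributes(source, target, weights, threshold=0.9):
    """Step 2: compare only attributes that share a data type."""
    target_groups = group_by_type(target)
    matches = []
    for dtype, src_attrs in group_by_type(source).items():
        for s in src_attrs:
            for t in target_groups.get(dtype, []):
                score = similarity(s, t, weights, threshold)
                if score is not None:
                    matches.append((s["name"], t["name"], round(score, 3)))
    return matches


# Hypothetical feature weights and schemas, invented for this example.
weights = {"distinct_ratio": 0.7, "length": 0.2, "nullable": 0.1}
name_f = {"length": 0.5, "nullable": 0.0, "distinct_ratio": 0.9}
num_f = {"length": 0.1, "nullable": 0.0, "distinct_ratio": 0.2}

source = [
    {"name": "cust_name", "type": "string", "features": name_f},
    {"name": "age", "type": "integer", "features": num_f},
]
target = [
    {"name": "customer", "type": "string", "features": name_f},
    {"name": "zip", "type": "string", "features": num_f},
    {"name": "years", "type": "integer", "features": num_f},
]

matches = match_attributes(source, target, weights)
print(matches)
```

In this toy data, `zip` has the same feature vector as `age` but a different data type, so the pair is never compared; this is the kind of interference the type-based classification is meant to filter out. The pair `cust_name` and `zip` shares a type but is rejected after the first, most heavily weighted feature.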