Friday, October 18, 2019

Power of Query Optimization: A Systematic Approach to Cost-Based Optimization - Dissertation

Power of Query Optimization: A Systematic Approach to Cost-Based Optimization in a Data Mining Environment - Dissertation Example

Extensive research has been done to date on database support for mining operations. The emphasis, however, has typically been on mining a single data set, even though the user often has to consult multiple data sets acquired from various data sources. In such cases it is essential for the KDD process to compare the patterns found in the different data sets and to understand how they relate to each other. A KDDMS therefore needs to support complex queries over multiple data sets, which calls for new functionality and optimizations, particularly for frequent item set mining.

The prime function of query optimization is faster response to queries. The semantic optimizer knows the data better than the user does, so it can replace the user's query with another query that produces the same outcome in less time. The new query is more efficient because less work is executed to retrieve the selected result tuples from the database.

The most advanced query optimizers select the one "best" plan at compile time to execute a given query (Ramakrishnan and Gehrke, 2000). The execution cost of each alternative plan is calculated, and the plan with the lowest overall cost is selected. Conventionally, the cost is determined from average statistics over the whole data, since the aim is to identify a single plan for all of the data. However, significant statistical variation among data subsets may lead to poor query execution performance (Christodoulakis, 1984).
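The effect of picking one plan from averaged statistics can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration and does not come from the cited works: the two candidate plans (`full_scan`, `index_lookup`), their toy cost formulas, the `random_io_penalty` factor, and the `Partition` type with its per-subset selectivity. It compares a single plan chosen from average statistics with plans chosen separately for each data subset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """A hypothetical data subset with its own statistics."""
    name: str
    rows: int
    selectivity: float  # fraction of rows that match the query predicate


def full_scan_cost(rows: int, selectivity: float) -> float:
    # Toy model: a full scan reads every row regardless of selectivity.
    return float(rows)


def index_lookup_cost(rows: int, selectivity: float,
                      random_io_penalty: float = 4.0) -> float:
    # Toy model: an index lookup touches only matching rows,
    # but each access is assumed to be more expensive.
    return rows * selectivity * random_io_penalty


# Candidate plans and their (assumed) cost functions.
PLANS = {"full_scan": full_scan_cost, "index_lookup": index_lookup_cost}


def cheapest_plan(rows: int, selectivity: float) -> tuple[str, float]:
    """Cost every alternative plan and return the cheapest one."""
    costs = {name: cost(rows, selectivity) for name, cost in PLANS.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


def monolithic_cost(partitions: list[Partition]) -> tuple[str, float]:
    """Pick ONE plan from average statistics, then pay its real cost per subset."""
    total_rows = sum(p.rows for p in partitions)
    avg_sel = sum(p.rows * p.selectivity for p in partitions) / total_rows
    plan, _ = cheapest_plan(total_rows, avg_sel)
    actual = sum(PLANS[plan](p.rows, p.selectivity) for p in partitions)
    return plan, actual


def per_partition_cost(partitions: list[Partition]) -> tuple[dict[str, str], float]:
    """Pick the cheapest plan separately for each subset."""
    choices = {p.name: cheapest_plan(p.rows, p.selectivity) for p in partitions}
    total = sum(cost for _, cost in choices.values())
    return {name: plan for name, (plan, _) in choices.items()}, total


if __name__ == "__main__":
    parts = [Partition("A", 1000, 0.125), Partition("B", 1000, 0.75)]
    print(monolithic_cost(parts))
    print(per_partition_cost(parts))
```

With these made-up numbers, the averaged statistics favor a full scan everywhere, while the per-subset choice uses the index for the highly selective subset and ends up with a lower total cost. Real optimizers use far richer cost models; the sketch only shows why a single plan for all data can be suboptimal.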
The basic disadvantage of this conventional approach is its very coarse optimization granularity: just one execution plan is selected for the entire data. This "monolithic" approach leaves out important opportunities for effective query optimization (Ramakrishnan and Gehrke, 2000). The research problem is therefore to augment cost-based optimization in data mining for patterns, in single and multiple databases, and the present study focuses on the cost-based optimization of queries in data mining.

2. Topics covered

Numerous research papers have been published in the areas of data mining, data warehousing and query optimization techniques. However, past research does not clearly specify the conditions under which one kind of query optimizer is likely to carry more weight than the others. According to Yu and Sub (n.d.), rules are deduced from the restriction clauses of the queries received at the database and from the outcomes they generate. It can also be stated that two syntactically distinct queries that generate the same outcome may differ in cost, depending on the approach each one takes. Ullman (1998) explained the principle of semantic query optimization, which uses semantic rules to rewrite a query into an equivalent but less expensive query in order to minimize the cost of query evaluation. Subramanian and Venkataraman (n.d.) proposed an architecture for processing complex decision-support queries that involve various heterogeneous data sources. They put forward the concept of transient views and formulate a cost-based algorithm that takes a query plan as input and develops an optimized "covering plan" by reducing the redundancies in the original input query plan. According to the research work of Stefan Berchtold et al. (2001), the problem of extracting all objects
