SlideShare a Scribd company logo
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 01 | Jan -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 162
Missing Value Evaluation in SQL Queries: A Survey
Smruti Mule1, Antara Bhattacharya2
1Student, Department Of Computer Science and Engineering ,G.H.Raisoni Institute of Engineering and Technology,
Nagpur, India, smrutimule@gmail.com
2 Assistant Professor, Dept. of Computer Science and Engineering,G.H.Raisoni Institute of Engineering and
Technology, Nagpur ,India ,antara.bhattacharya@raisoni.net
------------------------------------------------------------------***----------------------------------------------------------------
Abstract - After decades have been passed of taking efforts
on performance of database, the usability and quality of
database systems have gained more importance in recent
years. However, answering to why-notquestionsi.eevaluating
missing answers in SQL Queries after doing a lot of work has
also gained more attention. The main goal of this research
paper is evaluating missing value in the results obtained with
respect to different SQL Queries. At the same time, this
research paper fulfills the following goals: (i) surveying the
problem of evaluating the missing values i.e. why-not
questions in SQL queries; (ii) searching the techniques for
giving answers to such type of questions using different
numeric and non-numeric data and (iii) comparing those
efficient strategies. This research paper also gives attention
towards related work which were done so far .
Key Words: Missing answers, usability, SQL, top-k
1. INTRODUCTION
After decades, database community is taking efforts on
evaluation of missing values i.e answering to why-not
questions and comparing techniques used for evaluation of
this type of missing answers. The systems which are used
today are more efficient. However, these types of systems
are prominent in determining evaluation of management of
data and evaluation of query [2]. But to the same degree,
these systems are not suitable for end users. Now a day’s ,
users are expecting that systems should be easy to interact
and understand. It means, users are not agreed upon the
results obtained from such type of systems. Users are more
interested in knowing reasons for why the current set of
result does not match their expectation i.e. why current set
of objects are returned in the result by these systems. In
particular, users may interested in knowing the reason for
missing expected data object in result set and also
interested in knowing why unexpected data objects appear
in result received from the system. As a next step, users may
also find proper explanations for these types of questions.
Any system that can provide best explanations for type of
questions mentioned above can be very helpful for users to
better understand their information needs and also to make
system more transparent and interactive to users[3],[5]. At
present, traditional database systems are unable to provide
any kind of exploratory data analysis facilities to support
above types of why and why-not questions. The studies
focusing on improvement of database usability (e.g.,
keyword search [2], similar graph matching [4], and spatial
keywords[5] ), explaining the feature of missing tuples
which are not present in the result of query, are getting
more importance. A why-not question [1], [2] is being posed
when a user is interested in knowing why their expected
tuples are not present in query result. Recently users are
unable to sift directly in the set of data to examine “why-
not?”, due to the reason that interface of query i.e web forms
are restricted by the types of queries expressed by them.
When end users fires SQL query to get data from database
and ask “why-not?”and are unable to search the possible
ways fir getting explanation by means of query interface,
easily cause the situation where users does not use the tool
anymore. This would be the bad situation for database
developers who give their most of the time to develop such
applications. At the same time, supporting different aspects
of giving explanation for missing answers [1], knowledge of
algorithms which are based on query evaluation is required,
that is out of scope for most database developers. Recently,
community of databasestartedthe researchontechniquesto
evaluate missing answers. Out of this, recent works focuses
on giving answers to why-not questions. In this research
paper, answering both why and why-not questions are
addressed for numeric and non-numericdata presentinSQL
queries. In this research paper , aim is to evaluate missing
answers in SQL queries in terms of the above mentioned
aspects in different numeric and non-numericdata thathave
not been investigated by others. Rests of the sections of this
paper are as follows: Section 2 describes related work;
Section 3 presents comparison betweenstrategies;Section4
outlines future work; and Section 5 concludes our paper.
2. RELATED WORK
Previous studies [1],[4],[6],[7],[2],[3]and [5] have done
research on problems of evaluating missing answers in SQL
queries in terms of various different perspective. Xu et al.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 01 | Jan -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 163
explains to a user why their expected answers are not
present in the current set of objects in query result and
returns a refined query that includes expected missing
answers back to the result. Query refinement method [1] is
used which includes numeric attributes. Drawback of this
method is that it is not useful for non-numeric data. Islam et
al. explains the problem of evaluating missing answers in
matching of similar graph which are used for graph
databases. To address this problem, they have proposed an
approximate solution approach as computing the exact
solution is NPhard [4]. The search space for the new query
graph is also established.Drawback isonlysuitableforgraph
databases. He et al. explains the problem of answering why-
not question on two types of top-k queries: the basic top-k
query where the users need tospecifythesetofweighting[6]
and the top-k dominating query where users do not need to
specify the set of weightings as therankingfunction ranksan
object higher if it dominate more objects. Drawback is not
suitable for non-numeric data. Saiful et al. proposes
technique that aims at evaluating the why-not questions in
queries which are reverse skyline [7]. Also technique to
explain modification of why-not and query point which
includes why-not point in reverse skyline of the point called
as query point. It also explains position of query point
anywhere in a region without disturbing existingpointsthat
are reverse skyline. Drawback is only suitable for points of
data whose dynamic skyline contains query points. Chen et
al. proposed that keywords which are special in top-k
queries retrieves the k objects which are best as per the
function which considers both distance which is called a
special distance and similarity of text. Algorithm which is
having optimization sets that performs sequential
examination of setsofcandidatekeywordsisdeveloped.Also
index-based bound-and-prune algorithm [2] is used.
Drawback is only suitable for initial set of query keywords.
Geo et al. define and offer solutions to why-not questions on
MPRQ [3]. He have given a proposal of a framework which
are having three solutions that are efficient as follows : one
which involves modification of original query, one which
involves modification of why-not set, and last that involves
both modification of original query and why-not set. Time
required is more as experiments are performed using data
sets which are synthetic and original. Chen et al. addresses
problem of evaluating the missing answers in terms of
keywords top- k queries by performing the refinement of
keywords which are original that provides user with those
keywords which explains their intention of query. Also the
algorithm having different optimized techniques [5] is
proposed that searches the better solutionwhichisbased on
sets of keyword tested one by one. In some cases,
identification of keywords becomes difficult for users.
3. COMPARISON BETWEEN DIFFERENT
STRATEGIES
SR.NO STRATEGY ADVANTAGE DISADVANTAGE
1 Query
refinement
method
Used for
finding missing
values which
include
numeric
attributes
Not suitable
for non-numeric
data.
2 Index-based
bound-and-
prune
algorithm
Evaluate the
sequence of
keywords
sequentially.
Only suitable for
examination of
keywords present
in query.
3 Metric
probabilistic
range
queries
Define and
offer solutions
to why-
not questions o
n MPRQ.
Time required is
more as
experiments are
performed using
both real and
synthetic data sets.
4 NPhard Explains the
problem of
evaluating
missing values
in matching
similar graph.
Only suitable for
graph databases.
5 Optimizatio
n
Techniques
Determinesthe
good solution
totally based
on keywords
which are
tested at once.
Identification of
exact keywords is a
difficult task for the
users.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 01 | Jan -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 164
6 Ranking
Function
address the
problem of
answering
why-not
questions on
two types of
top-k queries:
the top-k
dominating
queries and
basic top-k
query.
Not suitable for
non-numeric data.
7 Queries
called as
Reverse
skyline.
Describes how
to update the
points calledas
why-not points
and also the
query point.
Only suitable for
points of data
whose dynamic
skylines contains
query points.
4. FUTURE WORK
The problem of evaluating missing answers in SQL Quires i.e
answering to why and why-not questions in other data
settings including social networks are studying currently.In
particular, working is going on the following type of queries
to answer the why and why-not questions.
Social and Graph Queries: Due to the emerging websites of
social networking and their greater impact on our daily life,
there is urgency for social queries on such networks. Social
networks data are generally represented as graphs in
databases. Many websites of social networking develop
recommendation that are automatic over different items
like giving suggestions on making new friends, events etc.
Hence, the feedback for such automated recommendation is
of more importance if user is not satisfied with them always.
Any social networking websites thatcananswersuchtypeof
why and why-not questions will be more interesting to their
users. In future work, this issue can be studied on queries
including data types like Binary Large Object (BLOB),
Boolean and others for missing value restoration and
thereby making the system flexible for maximum databases.
5.CONCLUSION
This paper presents the research agendas for evaluating the
missing answers in SQL Queries. Also shown why it is worth
conducting such research and outlined the various
techniques of giving answers to such type of why-not
questions. These papers have also summarized the related
work done in this area and the future research agendas.
Finally, contributions made so far are presented. Currently
work is in progress and focusing on future research
problems mentioned in this paper.
ACKNOWLEDGEMENT
The Research presented in this paper is possible due to the
data which is available through various reference papers
mentioned in this research paper. I take this opportunity to
express my gratitude towards Asst. Prof. Ms. Antara
Bhattacharya forcontinuoussupportingandencouragement.
I wouldlike to thank our Head Of the DepartmentV.M.Sahare
for giving proper guidance to submit this paper.
REFERENCES
[1] Wenjian Xu, Zhian He, Eric Lo, and Chi-Yin Chow,
“Explaining MissingAnswers toTop-k SQLQueries,”
IEEE Trans. Knowl. Data Eng., vol. 28, no. 8, pp.
2071–2085, July 2016.
[2] Lei Chen , Jianliang Xu, Xin Lin, Christian S.Jensen
and Haibo Hu, “Answering why-not spatial
keyword top-k queries via keyword adaption,” in
Proc. IEEE 32nd Int. Conf. Data Eng.,2016, pp. 697-
708.
[3] Yunjun Gao, Kai Wang, Christian S. Jensen and Gang
Chen, “Answering why-not questions on metric
probabilistic range queries,” in Proc. IEEE 32nd Int.
Conf. Data Eng.,2016, pp. 767-778.
[4] M. S. Islam, C. Liu, and J. Li, “Efficient answering of
why-not questions in similar graph matching,”IEEE
Trans. Knowl. Data Eng., vol. 27, no. 10, pp. 2672–
2686, Oct. 2015.
[5] L. Chen, X. Lin, H. Hu, C. S. Jensen, and J. Xu,
“Answering why not questions on spatial keyword
top-k queries,” in Proc. IEEE 31st Int. Conf. Data
Eng. , 2015, pp. 279–290.
[6] Z. He and E. Lo, “Answering why-not questions on
top-k queries,” IEEE Trans. Knowl. Data Eng., vol.
26, no. 6, pp. 1300–1315, Jun 2014
[7] Z. He and E. Lo, “Answering why-not questions on
top-k queries,” IEEE Trans. Knowl. Data Eng., vol.
26, no. 6, pp. 13Md. Saiful, Z. Rui, and L. Chengfei,
“On answering why-not questionsin reverseskyline
queries,” in Proc. IEEE 28th Int. Conf. Data Eng.,
2013, pp. 973–984.

More Related Content

PDF
Efficient Refining Of Why-Not Questions on Top-K Queries
PDF
IRJET- Text Document Clustering using K-Means Algorithm
PDF
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION
PDF
IRJET- An Automated Approach to Conduct Pune University’s In-Sem Examination
PDF
Hybrid Classifier for Sentiment Analysis using Effective Pipelining
PDF
A Survey on Sentiment Categorization of Movie Reviews
PDF
Semantic Based Model for Text Document Clustering with Idioms
PDF
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
Efficient Refining Of Why-Not Questions on Top-K Queries
IRJET- Text Document Clustering using K-Means Algorithm
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION
IRJET- An Automated Approach to Conduct Pune University’s In-Sem Examination
Hybrid Classifier for Sentiment Analysis using Effective Pipelining
A Survey on Sentiment Categorization of Movie Reviews
Semantic Based Model for Text Document Clustering with Idioms
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION

What's hot (19)

PPTX
Sources of errors in distributed development projects implications for colla...
PDF
Feature selection, optimization and clustering strategies of text documents
PDF
50120130406007
PDF
An empirical performance evaluation of relational keyword search systems
PDF
Modeling Text Independent Speaker Identification with Vector Quantization
PDF
H04564550
PDF
Not Good Enough but Try Again! Mitigating the Impact of Rejections on New Con...
PDF
IRJET- Implementation of Automatic Question Paper Generator System
PPTX
Programmer information needs after memory failure
PDF
A Review on Neural Network Question Answering Systems
PPTX
Query formulation process
PDF
Architecture of an ontology based domain-specific natural language question a...
PDF
Advanced Question Paper Generator using Fuzzy Logic
PDF
A Review on Novel Scoring System for Identify Accurate Answers for Factoid Qu...
PDF
Conceptual similarity measurement algorithm for domain specific ontology[
PDF
Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm
PDF
IRJET- Analysis of Question and Answering Recommendation System
PDF
Implementation of Semantic Analysis Using Domain Ontology
PDF
EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT’S ACADEMIC PERFORMANCE
Sources of errors in distributed development projects implications for colla...
Feature selection, optimization and clustering strategies of text documents
50120130406007
An empirical performance evaluation of relational keyword search systems
Modeling Text Independent Speaker Identification with Vector Quantization
H04564550
Not Good Enough but Try Again! Mitigating the Impact of Rejections on New Con...
IRJET- Implementation of Automatic Question Paper Generator System
Programmer information needs after memory failure
A Review on Neural Network Question Answering Systems
Query formulation process
Architecture of an ontology based domain-specific natural language question a...
Advanced Question Paper Generator using Fuzzy Logic
A Review on Novel Scoring System for Identify Accurate Answers for Factoid Qu...
Conceptual similarity measurement algorithm for domain specific ontology[
Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm
IRJET- Analysis of Question and Answering Recommendation System
Implementation of Semantic Analysis Using Domain Ontology
EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT’S ACADEMIC PERFORMANCE
Ad

Viewers also liked (20)

PDF
IRJET-CFD Analysis of conceptual Aircraft body
PDF
IRJET- Sensrless Luenberger Observer Based Sliding Mode Control of DC Motor
PDF
IRJET-ASIC Implementation for SOBEL Accelerator
PDF
IRJET- Assessment of Environmental Impacts during Operational Phase of a T...
PDF
IRJET-Evaluation of the Back Propagation Neural Network for Gravity Mapping
PDF
IRJET-A Virtual Grid-Based Dynamic Routes Adjustment (VGDRA) Scheme for Wir...
PDF
IRJET-A Survey On Opportunistic Piggyback Marking For IP Trace Back
PDF
IRJET-Design And Development Of Sugar Cane Sprout Cutter Machine By Human Pow...
PDF
IRJET-Accessing Information about Programs and Services through a Voice Site ...
PDF
IRJET-A Blind Watermarking Algorithm
PDF
IRJET-Securing Mobile Technology Of Gsm Using A5/1 Algorithm
PDF
IRJET-Solar Power Generation with Capacitor Based Seven Level Inverter System
PDF
IRJET-Power Flow & Voltage Stability Analysis using MATLAB
PDF
IRJET- Document Layout analysis using Inverse Support Vector Machine (I-SV...
PDF
IRJET-A Review on Two Stroke Single Cylinder Compressed Air Engine
PDF
IRJET-A Survey on Stealthy Denial of Service Strategy in Cloud Computing
PDF
IRJET-Survival and Growth Rate of Clarias gariepinus Larvae Fed with Artemia ...
PDF
IRJET-Review Paper On Usage Of Ferrocement Panels In Lightweight Sandwich Con...
PDF
IRJET- Assessment of Environmental Impacts during Operational Phase of a T...
PDF
IRJET-Comparative Analysis of DCT and DWT based novel methods for Watermarking
IRJET-CFD Analysis of conceptual Aircraft body
IRJET- Sensrless Luenberger Observer Based Sliding Mode Control of DC Motor
IRJET-ASIC Implementation for SOBEL Accelerator
IRJET- Assessment of Environmental Impacts during Operational Phase of a T...
IRJET-Evaluation of the Back Propagation Neural Network for Gravity Mapping
IRJET-A Virtual Grid-Based Dynamic Routes Adjustment (VGDRA) Scheme for Wir...
IRJET-A Survey On Opportunistic Piggyback Marking For IP Trace Back
IRJET-Design And Development Of Sugar Cane Sprout Cutter Machine By Human Pow...
IRJET-Accessing Information about Programs and Services through a Voice Site ...
IRJET-A Blind Watermarking Algorithm
IRJET-Securing Mobile Technology Of Gsm Using A5/1 Algorithm
IRJET-Solar Power Generation with Capacitor Based Seven Level Inverter System
IRJET-Power Flow & Voltage Stability Analysis using MATLAB
IRJET- Document Layout analysis using Inverse Support Vector Machine (I-SV...
IRJET-A Review on Two Stroke Single Cylinder Compressed Air Engine
IRJET-A Survey on Stealthy Denial of Service Strategy in Cloud Computing
IRJET-Survival and Growth Rate of Clarias gariepinus Larvae Fed with Artemia ...
IRJET-Review Paper On Usage Of Ferrocement Panels In Lightweight Sandwich Con...
IRJET- Assessment of Environmental Impacts during Operational Phase of a T...
IRJET-Comparative Analysis of DCT and DWT based novel methods for Watermarking
Ad

Similar to IRJET- Missing Value Evaluation in SQL Queries: A Survey (20)

PDF
B017350710
PDF
A Survey on Automatically Mining Facets for Queries from their Search Results
PDF
Query-Based Retrieval of Annotated Document
PDF
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
PDF
A survey on ranking sql queries using skyline and user
DOC
Efficient instant fuzzy search with proximity ranking
PDF
Automated Essay Grading using Features Selection
PDF
Query Recommendation by using Collaborative Filtering Approach
PDF
27 ijcse-01238-5 sivaranjani
PDF
International Journal of Engineering and Science Invention (IJESI)
PDF
International Journal of Engineering and Science Invention (IJESI)
PDF
Performance Evaluation of Query Processing Techniques in Information Retrieval
PDF
IRJET- Classifying Twitter Data in Multiple Classes based on Sentiment Class ...
PDF
Evaluating the effectiveness of data quality framework in software engineering
PDF
Application of hidden markov model in question answering systems
PDF
Partitioning of Query Processing in Distributed Database System to Improve Th...
PDF
professional fuzzy type-ahead rummage around in xml type-ahead search techni...
PDF
IRJET- Testing Improvement in Business Intelligence Area
DOC
Research proposal
B017350710
A Survey on Automatically Mining Facets for Queries from their Search Results
Query-Based Retrieval of Annotated Document
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
A survey on ranking sql queries using skyline and user
Efficient instant fuzzy search with proximity ranking
Automated Essay Grading using Features Selection
Query Recommendation by using Collaborative Filtering Approach
27 ijcse-01238-5 sivaranjani
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
Performance Evaluation of Query Processing Techniques in Information Retrieval
IRJET- Classifying Twitter Data in Multiple Classes based on Sentiment Class ...
Evaluating the effectiveness of data quality framework in software engineering
Application of hidden markov model in question answering systems
Partitioning of Query Processing in Distributed Database System to Improve Th...
professional fuzzy type-ahead rummage around in xml type-ahead search techni...
IRJET- Testing Improvement in Business Intelligence Area
Research proposal

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
PDF
Kiona – A Smart Society Automation Project
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
PDF
Breast Cancer Detection using Computer Vision
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Kiona – A Smart Society Automation Project
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
BRAIN TUMOUR DETECTION AND CLASSIFICATION
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Breast Cancer Detection using Computer Vision
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...

Recently uploaded (20)

PDF
Unit I ESSENTIAL OF DIGITAL MARKETING.pdf
PPTX
UNIT-1 - COAL BASED THERMAL POWER PLANTS
PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
PDF
Automation-in-Manufacturing-Chapter-Introduction.pdf
PPTX
Fundamentals of safety and accident prevention -final (1).pptx
PDF
Operating System & Kernel Study Guide-1 - converted.pdf
PPTX
web development for engineering and engineering
PDF
PREDICTION OF DIABETES FROM ELECTRONIC HEALTH RECORDS
DOCX
573137875-Attendance-Management-System-original
PPTX
Internet of Things (IOT) - A guide to understanding
PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PPTX
Foundation to blockchain - A guide to Blockchain Tech
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PPTX
OOP with Java - Java Introduction (Basics)
PDF
composite construction of structures.pdf
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PPT
Mechanical Engineering MATERIALS Selection
PPTX
Artificial Intelligence
PPT
Project quality management in manufacturing
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Unit I ESSENTIAL OF DIGITAL MARKETING.pdf
UNIT-1 - COAL BASED THERMAL POWER PLANTS
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
Automation-in-Manufacturing-Chapter-Introduction.pdf
Fundamentals of safety and accident prevention -final (1).pptx
Operating System & Kernel Study Guide-1 - converted.pdf
web development for engineering and engineering
PREDICTION OF DIABETES FROM ELECTRONIC HEALTH RECORDS
573137875-Attendance-Management-System-original
Internet of Things (IOT) - A guide to understanding
CYBER-CRIMES AND SECURITY A guide to understanding
Foundation to blockchain - A guide to Blockchain Tech
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
OOP with Java - Java Introduction (Basics)
composite construction of structures.pdf
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Mechanical Engineering MATERIALS Selection
Artificial Intelligence
Project quality management in manufacturing
Mitigating Risks through Effective Management for Enhancing Organizational Pe...

IRJET- Missing Value Evaluation in SQL Queries: A Survey

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 01 | Jan -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 162 Missing Value Evaluation in SQL Queries: A Survey Smruti Mule1, Antara Bhattacharya2 1Student, Department Of Computer Science and Engineering ,G.H.Raisoni Institute of Engineering and Technology, Nagpur, India, [email protected] 2 Assistant Professor, Dept. of Computer Science and Engineering,G.H.Raisoni Institute of Engineering and Technology, Nagpur ,India ,[email protected] ------------------------------------------------------------------***---------------------------------------------------------------- Abstract - After decades have been passed of taking efforts on performance of database, the usability and quality of database systems have gained more importance in recent years. However, answering to why-notquestionsi.eevaluating missing answers in SQL Queries after doing a lot of work has also gained more attention. The main goal of this research paper is evaluating missing value in the results obtained with respect to different SQL Queries. At the same time, this research paper fulfills the following goals: (i) surveying the problem of evaluating the missing values i.e. why-not questions in SQL queries; (ii) searching the techniques for giving answers to such type of questions using different numeric and non-numeric data and (iii) comparing those efficient strategies. This research paper also gives attention towards related work which were done so far . Key Words: Missing answers, usability, SQL, top-k 1. INTRODUCTION After decades, database community is taking efforts on evaluation of missing values i.e answering to why-not questions and comparing techniques used for evaluation of this type of missing answers. The systems which are used today are more efficient. However, these types of systems are prominent in determining evaluation of management of data and evaluation of query [2]. But to the same degree, these systems are not suitable for end users. Now a day’s , users are expecting that systems should be easy to interact and understand. It means, users are not agreed upon the results obtained from such type of systems. Users are more interested in knowing reasons for why the current set of result does not match their expectation i.e. why current set of objects are returned in the result by these systems. In particular, users may interested in knowing the reason for missing expected data object in result set and also interested in knowing why unexpected data objects appear in result received from the system. As a next step, users may also find proper explanations for these types of questions. Any system that can provide best explanations for type of questions mentioned above can be very helpful for users to better understand their information needs and also to make system more transparent and interactive to users[3],[5]. At present, traditional database systems are unable to provide any kind of exploratory data analysis facilities to support above types of why and why-not questions. The studies focusing on improvement of database usability (e.g., keyword search [2], similar graph matching [4], and spatial keywords[5] ), explaining the feature of missing tuples which are not present in the result of query, are getting more importance. A why-not question [1], [2] is being posed when a user is interested in knowing why their expected tuples are not present in query result. Recently users are unable to sift directly in the set of data to examine “why- not?”, due to the reason that interface of query i.e web forms are restricted by the types of queries expressed by them. When end users fires SQL query to get data from database and ask “why-not?”and are unable to search the possible ways fir getting explanation by means of query interface, easily cause the situation where users does not use the tool anymore. This would be the bad situation for database developers who give their most of the time to develop such applications. At the same time, supporting different aspects of giving explanation for missing answers [1], knowledge of algorithms which are based on query evaluation is required, that is out of scope for most database developers. Recently, community of databasestartedthe researchontechniquesto evaluate missing answers. Out of this, recent works focuses on giving answers to why-not questions. In this research paper, answering both why and why-not questions are addressed for numeric and non-numericdata presentinSQL queries. In this research paper , aim is to evaluate missing answers in SQL queries in terms of the above mentioned aspects in different numeric and non-numericdata thathave not been investigated by others. Rests of the sections of this paper are as follows: Section 2 describes related work; Section 3 presents comparison betweenstrategies;Section4 outlines future work; and Section 5 concludes our paper. 2. RELATED WORK Previous studies [1],[4],[6],[7],[2],[3]and [5] have done research on problems of evaluating missing answers in SQL queries in terms of various different perspective. Xu et al.
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 01 | Jan -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 163 explains to a user why their expected answers are not present in the current set of objects in query result and returns a refined query that includes expected missing answers back to the result. Query refinement method [1] is used which includes numeric attributes. Drawback of this method is that it is not useful for non-numeric data. Islam et al. explains the problem of evaluating missing answers in matching of similar graph which are used for graph databases. To address this problem, they have proposed an approximate solution approach as computing the exact solution is NPhard [4]. The search space for the new query graph is also established.Drawback isonlysuitableforgraph databases. He et al. explains the problem of answering why- not question on two types of top-k queries: the basic top-k query where the users need tospecifythesetofweighting[6] and the top-k dominating query where users do not need to specify the set of weightings as therankingfunction ranksan object higher if it dominate more objects. Drawback is not suitable for non-numeric data. Saiful et al. proposes technique that aims at evaluating the why-not questions in queries which are reverse skyline [7]. Also technique to explain modification of why-not and query point which includes why-not point in reverse skyline of the point called as query point. It also explains position of query point anywhere in a region without disturbing existingpointsthat are reverse skyline. Drawback is only suitable for points of data whose dynamic skyline contains query points. Chen et al. proposed that keywords which are special in top-k queries retrieves the k objects which are best as per the function which considers both distance which is called a special distance and similarity of text. Algorithm which is having optimization sets that performs sequential examination of setsofcandidatekeywordsisdeveloped.Also index-based bound-and-prune algorithm [2] is used. Drawback is only suitable for initial set of query keywords. Geo et al. define and offer solutions to why-not questions on MPRQ [3]. He have given a proposal of a framework which are having three solutions that are efficient as follows : one which involves modification of original query, one which involves modification of why-not set, and last that involves both modification of original query and why-not set. Time required is more as experiments are performed using data sets which are synthetic and original. Chen et al. addresses problem of evaluating the missing answers in terms of keywords top- k queries by performing the refinement of keywords which are original that provides user with those keywords which explains their intention of query. Also the algorithm having different optimized techniques [5] is proposed that searches the better solutionwhichisbased on sets of keyword tested one by one. In some cases, identification of keywords becomes difficult for users. 3. COMPARISON BETWEEN DIFFERENT STRATEGIES SR.NO STRATEGY ADVANTAGE DISADVANTAGE 1 Query refinement method Used for finding missing values which include numeric attributes Not suitable for non-numeric data. 2 Index-based bound-and- prune algorithm Evaluate the sequence of keywords sequentially. Only suitable for examination of keywords present in query. 3 Metric probabilistic range queries Define and offer solutions to why- not questions o n MPRQ. Time required is more as experiments are performed using both real and synthetic data sets. 4 NPhard Explains the problem of evaluating missing values in matching similar graph. Only suitable for graph databases. 5 Optimizatio n Techniques Determinesthe good solution totally based on keywords which are tested at once. Identification of exact keywords is a difficult task for the users.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 01 | Jan -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 164 6 Ranking Function address the problem of answering why-not questions on two types of top-k queries: the top-k dominating queries and basic top-k query. Not suitable for non-numeric data. 7 Queries called as Reverse skyline. Describes how to update the points calledas why-not points and also the query point. Only suitable for points of data whose dynamic skylines contains query points. 4. FUTURE WORK The problem of evaluating missing answers in SQL Quires i.e answering to why and why-not questions in other data settings including social networks are studying currently.In particular, working is going on the following type of queries to answer the why and why-not questions. Social and Graph Queries: Due to the emerging websites of social networking and their greater impact on our daily life, there is urgency for social queries on such networks. Social networks data are generally represented as graphs in databases. Many websites of social networking develop recommendation that are automatic over different items like giving suggestions on making new friends, events etc. Hence, the feedback for such automated recommendation is of more importance if user is not satisfied with them always. Any social networking websites thatcananswersuchtypeof why and why-not questions will be more interesting to their users. In future work, this issue can be studied on queries including data types like Binary Large Object (BLOB), Boolean and others for missing value restoration and thereby making the system flexible for maximum databases. 5.CONCLUSION This paper presents the research agendas for evaluating the missing answers in SQL Queries. Also shown why it is worth conducting such research and outlined the various techniques of giving answers to such type of why-not questions. These papers have also summarized the related work done in this area and the future research agendas. Finally, contributions made so far are presented. Currently work is in progress and focusing on future research problems mentioned in this paper. ACKNOWLEDGEMENT The Research presented in this paper is possible due to the data which is available through various reference papers mentioned in this research paper. I take this opportunity to express my gratitude towards Asst. Prof. Ms. Antara Bhattacharya forcontinuoussupportingandencouragement. I wouldlike to thank our Head Of the DepartmentV.M.Sahare for giving proper guidance to submit this paper. REFERENCES [1] Wenjian Xu, Zhian He, Eric Lo, and Chi-Yin Chow, “Explaining MissingAnswers toTop-k SQLQueries,” IEEE Trans. Knowl. Data Eng., vol. 28, no. 8, pp. 2071–2085, July 2016. [2] Lei Chen , Jianliang Xu, Xin Lin, Christian S.Jensen and Haibo Hu, “Answering why-not spatial keyword top-k queries via keyword adaption,” in Proc. IEEE 32nd Int. Conf. Data Eng.,2016, pp. 697- 708. [3] Yunjun Gao, Kai Wang, Christian S. Jensen and Gang Chen, “Answering why-not questions on metric probabilistic range queries,” in Proc. IEEE 32nd Int. Conf. Data Eng.,2016, pp. 767-778. [4] M. S. Islam, C. Liu, and J. Li, “Efficient answering of why-not questions in similar graph matching,”IEEE Trans. Knowl. Data Eng., vol. 27, no. 10, pp. 2672– 2686, Oct. 2015. [5] L. Chen, X. Lin, H. Hu, C. S. Jensen, and J. Xu, “Answering why not questions on spatial keyword top-k queries,” in Proc. IEEE 31st Int. Conf. Data Eng. , 2015, pp. 279–290. [6] Z. He and E. Lo, “Answering why-not questions on top-k queries,” IEEE Trans. Knowl. Data Eng., vol. 26, no. 6, pp. 1300–1315, Jun 2014 [7] Z. He and E. Lo, “Answering why-not questions on top-k queries,” IEEE Trans. Knowl. Data Eng., vol. 26, no. 6, pp. 13Md. Saiful, Z. Rui, and L. Chengfei, “On answering why-not questionsin reverseskyline queries,” in Proc. IEEE 28th Int. Conf. Data Eng., 2013, pp. 973–984.