Inconsistencies in big data

Post on 13-Apr-2017


1

INCONSISTENCIES IN BIG DATA

Prepared by: Minu Joseph

Guided by: Mr. Thomas Varghese

2

Contents

• Introduction.
• Problem Statement.
• 3V’s.
• Big data.
• Defining Big data.
• Dimensions of big data.
• Sources, applications of big data.
• Inconsistencies in big data.
• Inconsistency induced learning.
• Conclusion.
• References.

3

Introduction

• A torrent of data is generated and captured in digital form due to advances in science and technology.
• Everything we do increasingly leaves a digital trace.
• Big data refers to data sets so large and complex that traditional data processing applications are inadequate.

4

Problem Statement

• Big data: the next big thing in the IT industry.
• Classification of big data inconsistencies.
• Big data and big data analysis in terms of issues and challenges.
• Inconsistency induced learning: a tool to turn big data inconsistencies into helpful formulas for better analysis of results.

5

6

Big Data

• Big data can be described by:
    Volume
    Velocity
    Variety
    Variability
    Veracity
    Complexity

7

What is BIG DATA?

(Video: What is Big Data and how does it work)

8

9

Dimensions In Big Data

10

11

12

Levels of Knowledge

13

INCONSISTENCIES IN BIG DATA

• Temporal
• Spatial
• Text
• Functional Dependency

14

Temporal Inconsistencies

• Conflicting information.
• Data items with conflicting circumstances may coincide or overlap in time.
• Software requirements specifications (SRS) often contain inconsistent information.
• Inconsistent information affects the correctness and performance of the system.
• Concurrent programming errors in the Therac-25 (1985-1987) led to 6 accidents.
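The idea of conflicting data items that overlap in time can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Fact` record, its integer time stamps, and the `pump-1` sample data are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Fact:
    """Hypothetical record: an attribute value that holds over [start, end)."""
    entity: str
    attribute: str
    value: str
    start: int  # inclusive
    end: int    # exclusive


def overlaps(a: Fact, b: Fact) -> bool:
    # Two half-open intervals overlap when each starts before the other ends.
    return a.start < b.end and b.start < a.end


def temporal_conflicts(facts):
    """Return pairs that give different values for the same attribute at the same time."""
    conflicts = []
    for a, b in combinations(facts, 2):
        same_subject = (a.entity, a.attribute) == (b.entity, b.attribute)
        if same_subject and a.value != b.value and overlaps(a, b):
            conflicts.append((a, b))
    return conflicts


facts = [
    Fact("pump-1", "state", "on", 0, 10),
    Fact("pump-1", "state", "off", 8, 15),   # overlaps the first fact in [8, 10)
    Fact("pump-1", "state", "off", 15, 20),
    Fact("pump-2", "state", "on", 0, 10),
]
conflicts = temporal_conflicts(facts)
```

The pairwise comparison is quadratic in the number of facts, which is acceptable for a sketch but would need indexing or sorting by start time on large data sets.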

15

List of temporal inconsistencies

16

Spatial Inconsistencies

• Occur in datasets that include geometric or spatial dimensions.

• Traditional DB systems are enhanced to include spatially referenced data.

• Spatial inconsistencies can arise from:
    Geometric representation of objects
    Spatial relationships between objects
    Aggregation of composite objects
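An inconsistency in the spatial relationship between objects can be illustrated by comparing a declared relationship with the one implied by the geometry. The sketch below assumes objects are reduced to axis-aligned bounding boxes; the `Box` type, the relation names, and the sample shapes are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical axis-aligned bounding box."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def boxes_intersect(a: Box, b: Box) -> bool:
    # True when the interiors of the two boxes share some area.
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def spatial_conflicts(shapes, declared):
    """Compare each declared relation with the relation the geometry implies."""
    conflicts = []
    for (name_a, name_b), relation in declared.items():
        intersecting = boxes_intersect(shapes[name_a], shapes[name_b])
        actual = "overlaps" if intersecting else "disjoint"
        if actual != relation:
            conflicts.append((name_a, name_b, relation, actual))
    return conflicts


shapes = {
    "lake": Box(0, 0, 4, 4),
    "road": Box(3, 3, 8, 5),
    "park": Box(10, 10, 12, 12),
}
declared = {
    ("lake", "road"): "disjoint",  # contradicted by the geometry
    ("lake", "park"): "disjoint",
}
conflicts = spatial_conflicts(shapes, declared)
```

Real spatial databases use richer geometry and topological relations than this two-valued check; the boxes are only a stand-in.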

17

Spatial Inconsistencies contd..

18

Text Inconsistencies

• Inconsistencies found in unstructured natural language text.

• Data generated from social media, blogs, emails etc.

• If two texts refer to the same event or entity, they are said to be co-referent.

• Contradiction Detection detects text inconsistencies and has many applications.
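A very rough sense of contradiction detection can be given with a toy heuristic: two sentences that share most of their content words (a crude stand-in for co-reference) but differ in negation are flagged as possibly contradictory. This is an assumption-laden sketch, not a real contradiction detection method; the word lists, the `min_overlap` threshold, and the sample sentences are all hypothetical.

```python
import re

# Tiny illustrative word lists; a real system would need far more than this.
NEGATIONS = {"not", "no", "never"}
STOPWORDS = {"the", "a", "an", "is", "was", "were", "are"}


def tokens(text: str):
    return re.findall(r"[a-z']+", text.lower())


def content_words(text: str):
    return {t for t in tokens(text) if t not in NEGATIONS and t not in STOPWORDS}


def is_negated(text: str) -> bool:
    return any(t in NEGATIONS or t.endswith("n't") for t in tokens(text))


def may_contradict(a: str, b: str, min_overlap: float = 0.6) -> bool:
    """Flag sentence pairs with high word overlap but opposite negation."""
    words_a, words_b = content_words(a), content_words(b)
    if not words_a or not words_b:
        return False
    overlap = len(words_a & words_b) / len(words_a | words_b)  # Jaccard similarity
    return overlap >= min_overlap and is_negated(a) != is_negated(b)


first = "The bridge was closed on Monday."
second = "The bridge was not closed on Monday."
unrelated = "The airport was not busy."

flag_related = may_contradict(first, second)
flag_unrelated = may_contradict(first, unrelated)
```

The heuristic misses contradictions expressed through antonyms or numbers and will misfire on double negation, so it only illustrates the shape of the problem.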

19

Text Inconsistencies contd..

20

Functional Dependency Inconsistency

• A functional dependency requires that when certain attribute values are equal, other attribute values must also be equal.

• Many large databases are stored, aggregated, and cleaned with the help of an RDBMS.

• Functional dependencies play an important role in enforcing the integrity constraints of the database.

21

Functional Dependency Inconsistency contd…

• Violations of functional dependencies result in inconsistencies in data and information.
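Checking a functional dependency amounts to grouping rows by the determining attributes and looking for groups with more than one dependent value. The sketch below assumes rows are plain dictionaries; the `zip -> city` dependency and the sample rows are hypothetical illustrations.

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Find violations of the functional dependency lhs -> rhs.

    Returns a dict mapping each lhs value combination to the set of
    distinct rhs value combinations seen with it, for violating groups only.
    """
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        seen[key].add(tuple(row[attr] for attr in rhs))
    return {key: values for key, values in seen.items() if len(values) > 1}


rows = [
    {"zip": "95819", "city": "Sacramento"},
    {"zip": "95819", "city": "Sacremento"},  # misspelling breaks zip -> city
    {"zip": "10001", "city": "New York"},
]
violations = fd_violations(rows, lhs=["zip"], rhs=["city"])
```

In an RDBMS the same check is usually written as a `GROUP BY` on the determining attributes with a `HAVING COUNT(DISTINCT ...) > 1` filter.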

22

Inconsistency Induced Learning

• Improves data quality.
• Helps to enhance big data applications.
• Accommodates lifelong learning by allowing successive learning episodes to be triggered through inconsistencies an agent encounters during its problem-solving episodes.
• The basic idea is to identify the cause of an inconsistency and then apply cause-specific heuristics to resolve it.
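The basic idea, identifying the cause and then applying a cause-specific heuristic, can be pictured as a dispatch table. This is a minimal sketch and not the published inconsistency induced learning algorithm: the cause labels, the two heuristics (keep the most recent value, take the majority value), and the `learned` log are assumptions made for illustration.

```python
from collections import Counter


def resolve_temporal(item):
    # Hypothetical heuristic: trust the most recently recorded value.
    return max(item["candidates"], key=lambda c: c["timestamp"])["value"]


def resolve_functional_dependency(item):
    # Hypothetical heuristic: trust the value that occurs most often.
    values = [c["value"] for c in item["candidates"]]
    return Counter(values).most_common(1)[0][0]


# Cause-specific heuristics, keyed by the identified cause of the inconsistency.
HEURISTICS = {
    "temporal": resolve_temporal,
    "functional_dependency": resolve_functional_dependency,
}


def resolve(item, learned):
    """Apply the heuristic for the item's cause and record what was learned."""
    heuristic = HEURISTICS.get(item["cause"])
    if heuristic is None:
        return None  # unknown cause: leave the inconsistency unresolved
    value = heuristic(item)
    learned.append((item["cause"], item["key"], value))
    return value


learned = []
temporal_item = {
    "cause": "temporal",
    "key": "pump-1.state",
    "candidates": [{"value": "on", "timestamp": 1}, {"value": "off", "timestamp": 5}],
}
fd_item = {
    "cause": "functional_dependency",
    "key": "zip=95819",
    "candidates": [{"value": "Sacramento"}, {"value": "Sacramento"}, {"value": "Sacremento"}],
}
resolved_temporal = resolve(temporal_item, learned)
resolved_fd = resolve(fd_item, learned)
```

Each resolved inconsistency appends to `learned`, which loosely mirrors the idea of successive learning episodes triggered by the inconsistencies an agent encounters.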

23

Conclusion

• Multidimensional issues and challenges in big data and big data analysis.
• Types of inconsistencies.
• How to improve quality of big data analysis.

24

References

• www.slideshare.com
• dl.acm.org
• www.ieeexplore.ieee.org
• D. Zhang, On Temporal Properties of Knowledge Base Inconsistency. Springer Transactions on Computational Science.
• M. Schroeck, R. Shockley, J. Smart, D. Romero-Morales, and P. Tufano, Analytics: the real-world use of big data: how innovative enterprises extract value from uncertain data, Executive Report, IBM Institute for Business Value and Said Business School at the University of Oxford.
• Nasrin Irshad Hussain, Big Data, www.slideshare.com

25

QUESTIONS?

26