Post on 30-May-2015
Incentive Compatible Privacy-Preserving Data Analysis
M.V.Rupa Sri
310204120033
ABSTRACT
• In many cases, competing parties who have private data may collaboratively conduct privacy-preserving distributed data analysis (PPDA) tasks to learn beneficial data models or analysis results. The field of privacy has seen rapid advances in recent years because of increases in the ability to store data. In particular, recent advances in the data mining field have led to increased concerns about privacy.
• It is often highly valuable for organizations to have their data analyzed by external agents. However, any program that computes on potentially sensitive data risks leaking information through its output. Differential privacy provides a theoretical framework for processing data while protecting the privacy of individual records in a dataset.
EXISTING SYSTEM
• SECURE MULTIPARTY COMPUTATION
• Definition:
In existing systems, we generally assume that participating parties provide truthful inputs. This assumption is usually justified by the fact that learning the correct data analysis models or results is in the best interest of all participating parties. A party that does not want to learn the data models and analysis results should not participate in the protocol.
PROPOSED SYSTEM
• The term incentive compatible means that participating parties have the incentive or motivation to provide their actual inputs when they compute a functionality. Although privacy-preserving data analysis protocols based on secure multiparty computation (SMC), under the malicious adversary model, can prevent participating parties from modifying their inputs once the protocols are initiated, they cannot prevent the parties from modifying their inputs before the execution. On the other hand, parties are expected to provide their true inputs in order to correctly evaluate a function that satisfies the non-cooperative computation (NCC) model. Therefore, any functionality that satisfies the NCC model is inherently incentive compatible under the assumption that participating parties prefer to learn the function result correctly and, if possible, exclusively. The question is then which functionalities or data analysis tasks satisfy the NCC model.
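The limitation described above can be illustrated with a small sketch. It uses additive secret sharing to compute a sum, which is a standard SMC building block and not necessarily the protocol used in this project; the modulus, function names, and input values are all hypothetical. The protocol hides individual inputs, yet it has no way of noticing that a party changed its input before execution.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical public modulus agreed on by all parties


def make_shares(value, n_parties):
    """Split value into n additive shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(reported_inputs):
    """Each party shares its reported input; only share totals are combined."""
    n = len(reported_inputs)
    all_shares = [make_shares(v, n) for v in reported_inputs]
    # Party j adds up the j-th share it received from every party.
    partial_sums = [sum(row[j] for row in all_shares) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS


true_inputs = [120, 75, 310]
honest_result = secure_sum(true_inputs)
# Party 0 replaces 120 with 20 before the protocol starts.
# The protocol simply computes the sum of whatever was reported.
lying_result = secure_sum([20, 75, 310])
```

Here the lying party can even recover the true sum by adding 100 back to the output, which is the kind of behaviour the NCC model is meant to rule out.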
ADVANTAGES IN PROPOSED SYSTEM
• Each of these deals with the problem of ensuring truthfulness in data mining. However, each one requires the ability to verify the data after the calculation.
• Although verification-based techniques are very useful, there are cases where verification is not feasible due to legal, social, and privacy concerns.
MODULES
• User Interface Design
• Create Multiple Organizations
• Data Analysis and Integration
• Inputs Computation Model
• Association Data Mining
Module Description
• USER INTERFACE DESIGN:
• In this module we create a user page with a graphical user interface (GUI). The page connects the user to the server: the client sends its requests through it, and the server sends back its responses. This module thus establishes communication between client and server through a web page.
• A program interface that takes advantage of the computer's graphics capabilities to make the program easier to use. Well-designed graphical user interfaces can free the user from learning complex command languages. On the other hand, many users find that they work more effectively with a command-driven interface, especially if they already know the command language. Its goal is to enhance the efficiency and ease of use for the underlying logical design of a stored program. Thus the user interacts with information by manipulating visual widgets that allow for interactions appropriate to the kind of data they hold. The widgets of a well-designed interface are selected to support the actions necessary to achieve the goals of the user.
• CREATE MULTIPLE ORGANIZATIONS:
This is the second module of our project. Here we create a number of
parties (organizations). Each party has its own database in which it
stores its information. All n parties send their inputs to a single Data
Analysis module, which stores the inputs as either horizontal or vertical
partitions.
• DATA ANALYSIS AND INTEGRATION:
This is the third module of our project. Our Data Analysis module is designed
using cryptographic techniques. Data are generally assumed to be either
vertically or horizontally partitioned. In the case of horizontally partitioned
data, different sites collect the same set of information about different entities.
In the case of vertically partitioned data, different sites collect different
information about the same set of entities. A party can store its input data
as either a vertical or a horizontal partition. If the parties choose horizontal
partitioning, each party's input covers a different set of individuals. If the
parties choose vertical partitioning, each party's input covers different
attributes of the same individuals.
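As a minimal sketch of the two partitioning schemes, the following splits a toy table both ways. The records, attribute names, and helper functions are hypothetical and only illustrate the definitions; no cryptography is involved.

```python
# Hypothetical toy table: one row per entity, one column per attribute.
records = [
    {"id": 1, "age": 34, "income": 52000},
    {"id": 2, "age": 45, "income": 61000},
    {"id": 3, "age": 29, "income": 48000},
    {"id": 4, "age": 51, "income": 75000},
]


def horizontal_partition(rows, n_sites):
    """Each site holds all attributes for a different subset of entities."""
    return [rows[i::n_sites] for i in range(n_sites)]


def vertical_partition(rows, key, attribute_groups):
    """Each site holds a different attribute group for the same entities."""
    return [
        [{key: r[key], **{a: r[a] for a in group}} for r in rows]
        for group in attribute_groups
    ]


h_sites = horizontal_partition(records, 2)
v_sites = vertical_partition(records, "id", [["age"], ["income"]])
```

In the horizontal case each site sees complete rows for some entities; in the vertical case every site sees all entities but only its own columns, joined through the shared key.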
• INPUTS COMPUTATION MODEL
• This is the fourth module of our project. This model is designed to
compute over the truthful inputs of all participating parties. It rests on
two assumptions. First, the top priority of every participating party is to
learn the correct result. Second, if possible, every participating party
prefers to learn the correct result exclusively.
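The two assumptions can be read as a lexicographic preference: correctness first, exclusivity second. The sketch below encodes that reading as a sort key. The outcome encoding and the function name are hypothetical, and ranking the incorrect outcomes by exclusivity as well is an assumption made only to obtain a total order.

```python
def preference_key(outcome):
    """Sort key for one party: correctness first, exclusivity second.

    outcome = (party_learns_correct_result, number_of_other_parties_that_learn_it)
    """
    learns_correct, others_correct = outcome
    # Fewer other parties learning the correct result is better, hence the minus.
    return (learns_correct, -others_correct)


outcomes = [(False, 2), (True, 2), (True, 0), (False, 0)]
ranked = sorted(outcomes, key=preference_key, reverse=True)
```

Under this ordering a party never gives up a correct result to gain exclusivity, which matches the priority stated in the two assumptions.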
• ASSOCIATION DATA MINING
• This is the last module of our project. It summarizes association rule
mining and analyzes whether association rule mining can be done in an
incentive compatible manner over a horizontally or vertically partitioned
database. When a query is requested, the module determines whether the
requested data is located in a horizontal or a vertical partition, retrieves
the result from that partition, and sends the result to the requesting party.
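To make the quantities behind association rule mining concrete, here is a small sketch that computes the support and confidence of a rule X => Y over horizontally partitioned transactions. It shows only what is being computed: the per-site counts are combined in the clear, whereas a privacy-preserving protocol would combine them securely. The item names, site data, and function names are hypothetical.

```python
def local_counts(transactions, antecedent, consequent):
    """Counts that one site computes on its own horizontal partition."""
    both = antecedent | consequent
    return (
        sum(1 for t in transactions if antecedent <= t),  # contains X
        sum(1 for t in transactions if both <= t),        # contains X and Y
        len(transactions),
    )


def global_rule_stats(sites, antecedent, consequent):
    """Combine per-site counts into global support and confidence of X => Y."""
    x_count = xy_count = total = 0
    for transactions in sites:
        x, xy, n = local_counts(transactions, antecedent, consequent)
        x_count, xy_count, total = x_count + x, xy_count + xy, total + n
    support = xy_count / total
    confidence = xy_count / x_count if x_count else 0.0
    return support, confidence


site_a = [{"bread", "milk"}, {"bread", "butter"}, {"milk"}]
site_b = [{"bread", "milk", "butter"}, {"bread", "milk"}, {"butter"}]
support, confidence = global_rule_stats([site_a, site_b], {"bread"}, {"milk"})
```

Because the global counts are plain sums of local counts, a site that misreports its counts shifts the global result, which is why the incentive question matters for this task.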
TECHNIQUE USED
ASSOCIATION RULE MINING ALGORITHM
The above definition simply states which functions can be computed deterministically in the NCC setting (i.e., the computation result is correct with probability one): no party can compute the correct result once it lies about its inputs in a way that changes the original function result. In other words, if party $i$ replaces its true input $v_i$ with $v'_i$ and $f(v'_i, v_{-i}) \neq f(v_i, v_{-i})$, then party $i$ should not be able to calculate the correct $f(v_i, v_{-i})$ from $f(v'_i, v_{-i})$ and $v_i$. Note that a strategy $(t_i, g_i)$ consists of the way the input is modified, denoted by $t_i$, and the way the output is calculated, denoted by $g_i$. Here $t_i$ can be considered as choosing a value different from the actual input, and $g_i$ can be considered as the way the correct $\mu$ and $s^2$ are computed. Another implication of the above definition is that for any $t_i$, the corresponding $g_i$ should be deterministic, because each party wants to compute the "correct" result exactly.
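The condition above can be checked by brute force for small finite input domains. The sketch below is a simplified reading of the definition, not the paper's formal test: for a fixed true input and a fixed false input, it asks whether the lie changes the result for at least one choice of the other parties' inputs, and whether the true result is still a well-defined function of the observed output. The function name and the two example functions are hypothetical.

```python
from itertools import product


def can_lie_and_recover(f, domain, n_parties, i):
    """Return a (true, fake) input pair that lets party i lie yet always
    recover the true result from the observed one, or None if none exists."""
    others_space = list(product(domain, repeat=n_parties - 1))
    for true_v, fake_v in product(domain, repeat=2):
        if true_v == fake_v:
            continue
        recover = {}            # observed output -> true output (party i's g_i)
        changes_result = False  # does the lie ever alter the function result?
        consistent = True       # is g_i a well-defined function?
        for others in others_space:
            true_out = f(*(others[:i] + (true_v,) + others[i:]))
            seen_out = f(*(others[:i] + (fake_v,) + others[i:]))
            changes_result = changes_result or (true_out != seen_out)
            if recover.setdefault(seen_out, true_out) != true_out:
                consistent = False
                break
        if changes_result and consistent:
            return (true_v, fake_v)
    return None


def xor(a, b):
    return a ^ b


def equal(a, b):
    return int(a == b)


xor_attack = can_lie_and_recover(xor, (0, 1), 2, 0)
equal_attack = can_lie_and_recover(equal, (0, 1, 2), 2, 0)
```

For two-party XOR the check finds a profitable lie, since flipping the input and then flipping the output always yields the true result. For the equality test over three values it finds none, because a lying party can no longer tell the outcomes apart.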
• A two-party protocol is proposed to securely compute JC. The protocol consists of two stages
SYSTEM ARCHITECTURE
[Figure: system architecture diagram. Components: Parties, User login, DB, Validate, Data analysis (vertical partition, horizontal partition), NCC Model, TTP, Rule mining.]
System Architecture Description
• The above diagram contains Client Login, Database, Work Allocation, Worker Page, Computing, Reposting, and Work Grouping. First, the computation node starts running. Next, a party node enters a user name and password, which are validated by the compatible node. The computation node then assigns the work to the data mining nodes. Each data mining node finishes its work and reports back to the compatible node. The trusted third party (TTP) collects the inputs of the parties and groups them for the particular work presented by the party nodes.
USE CASE DIAGRAM
[Figure: use case diagram. Actors: party1, party2, party3, TTP. Use cases: input, private inputs, compute the input data, function over join the inputs, vertical partition, horizontal partition, NCC model, Data mining.]
CLASS DIAGRAM
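• Parties: attributes create, data, store; operations create(), database(), vertical(), horizontal()
• Input Computation Model: attributes update, data, compute; operations receive(), compute(), correct(), exclusive()
• NCC_Model: attributes create, input, data, store, retrieve; operations inputs(), datamining(), ttp(), vertical(), horizontal()
• Data Mining: attributes store, update, input; operations horizontal(), vertical(), compute()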
SEQUENCE DIAGRAM
[Figure: sequence diagram. Lifelines: parties, data analysis, NCC Model, Rule mining. Messages: to store data; either vertical or horizontal; sending the inputs; all the inputs are computed; different inputs of parties stored; sending requested data to NCC; response.]
ACTIVITY DIAGRAM
[Figure: activity diagram. Elements: parties, NCC Model, Data Mining, vertical partition, horizontal partition.]
LOGIN FORM
Organization Login
ORGANIZATION’S INFORMATION:
Participating parties:
Data Sharing:
Conclusion
• Even though privacy-preserving data analysis techniques guarantee that nothing other than the final result is disclosed, whether or not participating parties provide truthful input data cannot be verified. In this paper, we have investigated which kinds of PPDA tasks are incentive compatible under the NCC model. Based on our findings, there are several important PPDA tasks that are incentive driven. Table II classifies the common data analysis tasks studied in this paper into DNCC or Non-DNCC categories. In most cases, the data partition scheme makes a difference in determining the DNCC or Non-DNCC classification.