Building the Data Layer With Entity Framework 4.0

Building the data layer in a .NET 4.0 application with Entity Framework

Mihai Tătăran

General Manager, H.P.C. Consulting

Microsoft Most Valuable Professional, ASP.NET

http://www.hpc-consulting.ro/index.php/blog/

http://www.codecamp.ro

1. Introduction
  1.1. Who is this book for
2. The data layer with an O/RM
  2.1. What is an Object/Relational Mapper
  2.2. Why building the data layer with an O/RM
3. Introducing Microsoft Entity Framework
  3.1. The first application with Entity Framework
    Setting up the demo
    Getting some data from the database
    Adding data to the database
  3.2. Entity Framework explained
    Entities
    ObjectContext
    Working with data
  3.3. Common scenarios
    Loading references (child Entities). Lazy versus Eager Loading
    Delayed query execution
    Inserting real life data
    Updating real life data
    Adding computed properties to an Entity
    Resolving concurrency issues
    Mapping stored procedures to the model
    POCO Entities
    Many to many entities
    Inheritance
      Table per hierarchy (TPH)
      Table per type (TPT)
      Table per concrete class (TPC)
      Choosing between strategies
      Samples
      TPH sample
      TPT sample
      TPT with POCO entities sample
4. Advanced topics in Entity Framework
  4.1. Transactions
  4.2. Audit and logging with Entity Framework
  4.3. Improving performance
  4.4. Encrypting data in the database
  4.5. Using Entity Framework in different architectures

1. Introduction

1.1. Who is this book for

This book is a guide to building the data layer with Entity Framework in Microsoft .NET 4.0. It is aimed at programmers who already have experience with Microsoft .NET technologies.

2. The data layer with an O/RM

2.1. What is an Object/Relational Mapper

There are many ways to create the data layer in a .NET application. Using an O/RM tool is one of them, and several technologies can fill that role, for example Entity Framework, NHibernate, Data Objects, etc.

Generally, an O/RM is a technology which helps you transform data from its relational form into an object form. In the database you typically store data in a relational manner (tables, views, stored procedures, etc.), and in an application you have data represented as objects and collections of objects. Thus, a table in the relational world corresponds to a class in the object world, a record in a table corresponds to an instance of that class, and so on.

O/RM tools and technologies help you mainly by automating the process of getting data from the database into your objects, and the other way around. They usually come with a designer which lets you quickly create the entity classes from the database schema, or the database schema from the entity classes.
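To make the table-to-class and record-to-instance correspondence concrete, here is a minimal sketch of the mapping written by hand, which is the kind of code an O/RM generates or replaces. It does not use Entity Framework at all. The ManualMapper and MapperDemo names, the dictionary-based rows and the sample values are assumptions made for illustration only; the column names follow the Author table of the sample database used later in the book.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity class: the object-world counterpart of an Author table.
public class Author
{
    public int AuthorId { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
}

public static class ManualMapper
{
    // Turns one relational record (column name -> value) into one object.
    public static Author ToAuthor(IDictionary<string, object> row)
    {
        return new Author
        {
            AuthorId = (int)row["AuthorId"],
            Name = (string)row["Name"],
            Address = (string)row["Address"]
        };
    }

    // Turns a whole result set into a collection of objects.
    public static List<Author> ToAuthors(IEnumerable<IDictionary<string, object>> rows)
    {
        return rows.Select(r => ToAuthor(r)).ToList();
    }
}

public static class MapperDemo
{
    public static void Main()
    {
        // Stand-in for rows read from the database (illustrative values).
        var rows = new List<IDictionary<string, object>>
        {
            new Dictionary<string, object> { { "AuthorId", 1 }, { "Name", "Jane Doe" }, { "Address", "Sample Street 1" } },
            new Dictionary<string, object> { { "AuthorId", 2 }, { "Name", "Sample Author" }, { "Address", "Sample Street 2" } }
        };

        foreach (Author a in ManualMapper.ToAuthors(rows))
        {
            Console.WriteLine(a.AuthorId + "\t" + a.Name);
        }
    }
}
```

Every new column or table means more code of this kind to write and maintain, which is the work an O/RM takes over.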

2.2. Why building the data layer with an O/RM

You should use an O/RM for your application when the application is large enough. I know, “large enough” is very ambiguous, but I want to emphasize that applications which take a long time to build and maintain are worth building in an object-oriented manner.

As I said, there are other ways to build the data layer in a .NET application: (1) writing all the code from scratch, maybe using data layer patterns like Table Data Gateway (http://martinfowler.com/eaaCatalog/tableDataGateway.html) or Active Record (http://www.martinfowler.com/eaaCatalog/activeRecord.html); (2) using the Microsoft Enterprise Library Data Access Application Block (http://msdn.microsoft.com/en-us/library/ms954836.aspx), which is essentially a wrapper over the ADO.NET methods; (3) using LINQ to SQL (http://msdn.microsoft.com/en-us/library/bb425822.aspx), which is a nice technology for a data layer and is often thought of as an O/RM, although it actually is not one.

Using an O/RM gives you the advantage of treating data as instances of classes that model business-related entities. For example, when building an online shop, you will have entities (classes) like Product, ProductCategory, Order, etc. Placing an order in the system then means creating an object of type Order, associating it with an instance of an existing Product, and storing it in the database. Something like:

Order o = new Order();
o.Product = context.Product.Where(p => p.Name == "BMW").FirstOrDefault();
context.AddToOrder(o);
context.SaveChanges();

When working in a relational fashion, placing an order means creating an ADO.NET command which contains an INSERT statement into the Order table, with a unique identifier from the Product table identifying the product related to the order. Something like:

SqlCommand com = new SqlCommand("INSERT INTO [Order](..., ProductId) VALUES(..., " + productId + ")", sqlConnection);
sqlConnection.Open();
com.ExecuteNonQuery();
sqlConnection.Close();

The first approach is more intuitive and thus easier and cheaper to maintain in the long run.

3. Introducing Microsoft Entity Framework

Microsoft Entity Framework was introduced in 2008, in .NET 3.5 SP1. It is a new technology for accessing data, but classic ADO.NET is still supported in .NET 4.0 and will continue to be supported in future versions.

Besides the general advantages of building the data layer with an O/RM, Entity Framework has a few more: (1) you don’t need to worry about the underlying database schema; (2) it integrates easily with other Microsoft technologies like Windows Communication Foundation, Silverlight, ASP.NET and so on.

3.1. The first application with Entity Framework

All the samples in this book are built on top of the same database, called DataLayerEntityFramework. The initial structure is presented in the following diagram:

So, we have a database which keeps track of Publications (Books or Videos), which are created by Authors and may have additional Resources.

Setting up the demo

For a simple demo that shows the basic use of Entity Framework, we will create a console application. I will use Visual Studio 2010 Beta 2 for all my examples. The application, called 01_FirstApplication, can also be found in the companion code archive and was created following these steps:

1. We have a console application created from the Visual Studio 2010 template.

2. We right-click on the project, select Add -> New Item and choose ADO.NET Entity Data Model to add a new edmx file to our project:

[Database diagram: Author (AuthorId, Name, Address); Publication (PublicationId, Title, AuthorId, IsBookOrVideo); PublicationResource (PublicationResourceId, ResourceName, ResourceUrl, PublicationId)]

3. A wizard for creating the Entity Data Model starts. Choose “Generate from database” and click Next.

4. In the Choose Your Data Connection screen, select the connection to the database server which has the DataLayerEntityFramework database attached. After clicking on “New Connection...”, enter the server name, then select the database and click OK:

5. Check “Save entity connection settings in App.Config as:” and click Next.

6. This is the step where we choose which objects from the database should appear in the Entity Data Model of our application. In our simple example we only use the tables in the above diagram, so check only those. Please notice the checkbox “Pluralize or singularize generated object names”, which is new in .NET 4.0 and makes the designer use plural or singular forms for the generated object names:

7. After selecting the tables, click Finish. Now Visual Studio creates:

a. An edmx file, which is basically an XML file, but Visual Studio has a designer for it.

b. A .designer.cs file, which contains the C# code for our entities.

Let’s spend a little time looking at the generated diagram. First, we see the 3 entities corresponding to the 3 tables in our database. One very important aspect, which underlines the role of an O/RM, is how relationships between entities are represented. Take for example two tables from our database: Author and Publication. A Publication has an Author, and in the database we represent this with a Foreign Key, which is the way relationships are represented in a relational data store. In our entity model, an object of type Author has a collection of objects of type Publication (called Publications), and an object of type Publication contains an object of type Author.

The Visual Studio Entity designer supports UML and is totally different from the older Class Diagram designer. In the designer you can see the Mapping Details section, as well as the Properties available for entities and for the properties of our entities. For example, if you click on Publication in the designer, you will see these properties: Abstract = False, Access = Public, and so on, all referring to the Publication class. If you select the Title property of the Publication class, you see its properties, e.g. Nullable = False (as it is set in the database), and so on.

Getting some data from the database

Now let us perform some simple actions with the data. We will query the database for the list of available Publications. In order to work with the entities and the supporting database, we need an instance of a class derived from ObjectContext, which was generated by the Visual Studio designer when we created the model. In our case, this class is called DataLayerEntityFrameworkEntities, as you can check in Model1.Designer.cs. An ObjectContext instance handles all the communication with the database and holds the references to all the loaded objects; we will explain it more clearly in the following section.

So, in the Main method of Program.cs, we put this code:

static void Main(string[] args)
{
    DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
    List<Publication> publications = context.Publications.ToList();
    foreach (Publication pub in publications)
    {
        Console.WriteLine(pub.Title);
    }
}

We declare a list of Publication objects to hold the query results. We get the publications through the context’s Publications property (a kind of queryable collection), but we need to call ToList() in order to actually build the query, execute it against the database and return the data. If you run this program, you should get a result similar to this:

Now let’s play a bit with this code. Say we want to see not only the Publication’s Title, but also its Author’s Name. We change the line which prints to the console as follows:

Console.WriteLine(pub.Title + ". Author: " + pub.Author.Name);

Then we run the application and ...

At this point, a reader who has worked with Entity Framework in version 3.5 SP1 might be a little surprised, as I initially was. This is an improvement in .NET 4.0: the pub.Author object is loaded automatically even though we never ask for it explicitly. What happens in our case is Lazy Loading: we first load the Publications (a query against the Publication table only), then, in the foreach loop, each access to pub.Author triggers a new query behind the scenes to get the Author of that specific publication.

Lazy Loading is one option for loading the references of entities. The other one is called Eager Loading, and it queries all the specified tables at once (it actually performs a SQL JOIN). So let’s change our code to this:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
List<Publication> publications = context.Publications.Include("Author").ToList();
foreach (Publication pub in publications)
{
    Console.WriteLine(pub.Title + ". Author: " + pub.Author.Name);
}

The result of executing the code above is the same as the previous one. The only difference is how the queries are made in the database. To recap:

1. In the first case (without any Include(...)), context.Publications.ToList() performs a query on the Publication table. Then, each access to pub.Author.Name performs a query on the Author table with a WHERE clause.

2. In the second case, a single SELECT with a JOIN is performed on the Publication and Author tables.

Adding data to the database

Now let’s say we want to add a new Publication to the database. We can do this by following these steps:

1. Create a new object of type Publication.

2. Set all of its mandatory properties. These include the Author.

3. Save the new object to the database.

Here is the code for this scenario:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Publication newPub = new Publication()
{
    Title = "New Title",
    IsBookOrVideo = true,
    Author = context.Authors.Where(a => a.Name == "Jane Doe").FirstOrDefault()
};
context.AddToPublications(newPub);
context.SaveChanges();

// check if the new publication is in the database
foreach (Publication pub in context.Publications.ToList())
{
    Console.WriteLine(pub.Title + ". Author: " + pub.Author.Name);
}

I have to explain a few things. The Author property is set by first querying the database for an Author with the name “Jane Doe”; if a result is present, it is assigned to the property. Where(...) is an extension method from LINQ (Language Integrated Query) which filters the data source it is applied to. In our case we query entities, so we are talking about LINQ to Entities here. The FirstOrDefault() method returns the first element which matches the query, or null if there are no results.

After we have the new Publication instance, we have to add it to the ObjectContext, so that the context can keep track of it from now on. At this point, our context knows about the new Publication, but only in memory. It will be added to the database the next time we call SaveChanges() on the context. Thus, it’s possible to perform many operations on the context’s objects in memory (add new objects, update or delete existing ones), and when we call SaveChanges() a batch of operations (actually a Transaction) is sent to the database.

And here is the result:

This is it for now. In the following sections I will explain how Entity Framework works and go through some of the most common scenarios.

3.2. Entity Framework explained

In this section I will explain the underpinnings of Entity Framework.

Entities

An “entity” in Entity Framework is an item (class) which is described in an edmx file and which we can see in the designer. All objects which contain data from our model are instances of these entities. What makes up an entity:

1. Data properties – properties which contain the data and which correspond to the table columns.

2. Reference properties – objects or collections of objects which represent the relationships from the database. In our example “01_FirstApplication”, the Publication entity has a property called AuthorReference of type EntityReference<Author>, and the Author entity has a property called Publications of type EntityCollection<Publication>.

3. No behavior – like any other business entities, they have no methods apart from those used to keep track of entities in the ObjectContext.

ObjectContext

ObjectContext is the primary class for interacting with entities. An instance of this class is needed in order to work with the underlying database, as it encapsulates:

1. A connection to the database, which is managed transparently for the programmer, but which can also be managed directly.

2. Metadata for the entity model.

3. An ObjectStateManager object that keeps the entities in the cache.

Working with data

With the help of Entities and the ObjectContext, we can work with data on the application side, in memory, without having to interact with the database every time we change our data, but only when we want to persist it. Of course, different architecture types call for different strategies for managing the ObjectContext (and therefore the data cache), and sometimes it won’t be optimal to cache the data in memory, so we will prefer a stateless ObjectContext. More on this subject in the chapter Using Entity Framework in different architectures.
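The idea of changing data in memory and persisting it only on request can be sketched without Entity Framework. The SketchContext class below is a hypothetical stand-in, not part of EF: it only mimics the pattern in which additions are tracked in memory and written in one batch when SaveChanges() is called. Like ObjectContext.SaveChanges(), its SaveChanges() returns the number of items that were written.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a context: it records pending inserts in memory
// and only touches the (simulated) database when SaveChanges() is called.
public class SketchContext
{
    private readonly List<string> pending = new List<string>();
    private readonly List<string> database = new List<string>();

    public int PendingCount { get { return pending.Count; } }
    public int DatabaseCount { get { return database.Count; } }

    // Registers a new item with the context; nothing is persisted yet.
    public void Add(string title)
    {
        pending.Add(title);
    }

    // Flushes all pending items in one batch and returns how many were written.
    public int SaveChanges()
    {
        int written = pending.Count;
        database.AddRange(pending);
        pending.Clear();
        return written;
    }
}

public static class UnitOfWorkDemo
{
    public static void Main()
    {
        var context = new SketchContext();
        context.Add("New Title");
        context.Add("Another Title");
        Console.WriteLine(context.PendingCount + " pending, " + context.DatabaseCount + " saved");
        // prints "2 pending, 0 saved"

        int written = context.SaveChanges();
        Console.WriteLine(written + " written, " + context.DatabaseCount + " saved");
        // prints "2 written, 2 saved"
    }
}
```

The real ObjectContext also tracks updates and deletes; the sketch leaves those out to keep the batching idea visible.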

Getting data ultimately means creating a query and running it against a database. We do not explicitly create a SQL command (as we used to do in ADO.NET 2.0) but use the entities and the ObjectContext. Here is the example we saw in the previous section:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
List<Publication> publications = context.Publications.ToList();

The ObjectContext instance, context, needs to be created first. After this instantiation we can work with data for as long as this object exists. The next line actually creates and executes the query against the database, like this:

1. When we call context.Publications we tell Entity Framework to prepare a query. Of course, we can apply extension methods like Where(...), in which case the query becomes more specific.

2. When we call ToList() or any method which returns entities or collections of entities (like First() or FirstOrDefault()), the query prepared in step 1 is sent to the database server.

3. The resulting data is sent back to Entity Framework over the database connection, and the relational structure is automatically transformed into entities. At this point, the data is loaded in the ObjectContext cache and will continue to exist there as long as the context is alive.

3.3. Common scenarios

From my experience with Entity Framework, some usage scenarios are very common. Almost every EF-enabled application will use most of the scenarios listed in this section, and there may be others which I don’t mention here.

To exemplify these scenarios, I will use the application “02_CommonScenarios_1” from the companion code archive, with the same database “DataLayerEntityFramework”.

Loading references (child Entities). Lazy versus Eager Loading

Our sample application already has the entity model: Author, Publication, PublicationResource. Next, we will load an Author together with all his Publications and all of the PublicationResources attached to them. As you may remember from our first example, there are 2 options for loading references: Lazy Loading and Eager Loading.

Here is the code which does this in a Lazy Loading fashion:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Author author1 = context.Authors.Where(a => a.Name == "Jane Doe").FirstOrDefault();
Console.WriteLine(author1.Name);
foreach (Publication pub in author1.Publications)
{
    Console.WriteLine("\t" + pub.Title);
    foreach (PublicationResource pubRes in pub.PublicationResources)
    {
        Console.WriteLine("\t\t" + pubRes.ResourceName);
    }
}

The result after executing the application:

We may not want EF to work this way, since it does not perform very well. If you run the application under the debugger and trace the commands executed on an MS SQL Server instance with SQL Server Profiler, you can see exactly which queries are sent to the database engine:

1. Put a breakpoint on the first Console.WriteLine statement, immediately after loading the author. Then look in the Profiler and you will see that the command executed so far only queries the Author table.

2. Put a breakpoint on the next Console.WriteLine statement, inside the first foreach. Run the application up to that breakpoint and look in the Profiler to see that another query has been performed, this time looking for all the Publications with a specific AuthorId.

3. Then put a breakpoint on the last Console.WriteLine statement and continue directly to it. Look in the Profiler and see that yet another query has been run, which selects all the PublicationResources for a PublicationId which belongs to that AuthorId.

We can see that for every Author (in our case there is only one) a query is made to get his Publications, and then for every Publication a query is made to get its PublicationResources. If we had 1000 Authors, each with 100 Publications, and each Publication with 20 resources, you can imagine the time it would take to get the result we are looking for.
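To put a number on that scenario, here is a small worked calculation. It assumes a loop over all Authors in which every navigation is lazily loaded as described above: one query for the Authors, one per Author for his Publications, and one per Publication for its PublicationResources. The LazyLoadingCost class and the LazyQueryCount method are names made up for this sketch.

```csharp
using System;

public static class LazyLoadingCost
{
    // Number of queries when every navigation is lazily loaded:
    // one for the authors, one per author for publications,
    // one per publication for resources.
    public static long LazyQueryCount(long authors, long publicationsPerAuthor)
    {
        long publications = authors * publicationsPerAuthor;
        return 1 + authors + publications;
    }

    public static void Main()
    {
        // 1 + 1000 + 1000 * 100
        Console.WriteLine(LazyQueryCount(1000, 100)); // prints 101001
    }
}
```

The number of resources per Publication does not change the number of queries, only the number of rows each query returns. For the single Author loaded in the sample, the count is 1 + 1 + the number of his Publications.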

We can achieve the same result by eagerly loading all the references. Here is the code for this option:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Author author1 = context.Authors
    .Include("Publications")
    .Include("Publications.PublicationResources")
    .Where(a => a.Name == "Jane Doe")
    .FirstOrDefault();

Console.WriteLine(author1.Name);
foreach (Publication pub in author1.Publications)
{
    Console.WriteLine("\t" + pub.Title);
    foreach (PublicationResource pubRes in pub.PublicationResources)
    {
        Console.WriteLine("\t\t" + pubRes.ResourceName);
    }
}

Please notice the Include() methods and their parameters. To get child entities in a cascading manner (Author->Publication->PublicationResource), we have to include all the relationships, starting from Author. The first one is called “Publications”; notice the plural used here, which is a difference from EF in .NET 3.5 SP1. The next reference is specified by its full path, “Publications.PublicationResources”.

Only one query is executed against the database, and it performs a join between all three tables involved: Author, Publication, PublicationResource. It is quite large, but I included it here:

Delayed query execution

The moment a query is executed in the database is important, and we control it through the way we write our EF code. Let’s take this example:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
List<Author> authors = context.Authors.ToList();
foreach (Author author in authors)
{
    Console.WriteLine(author.Name + "\t" + author.Address);
}

The query is executed in the database engine as soon as the ToList() method is called, and after that the authors object is loaded with data. Now let’s look at this piece of code:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
var authors = context.Authors;
foreach (Author a in authors)
{
    Console.WriteLine(a.Name + "\t" + a.Address);
}

In this case, no query has been sent to the database engine by the time execution reaches the foreach statement. EF has only prepared a query, held by the implicitly typed variable “authors”, and that query runs only when “authors” is first enumerated, here by the foreach loop. This option of delaying query execution is very helpful when considering performance, and we will look deeper into this aspect later on.
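The same deferred behaviour can be observed with plain LINQ to Objects, without a database. The sketch below is only an analogy: the Authors() iterator is a hypothetical stand-in for a table, the names in it are illustrative, and the Reads counter stands in for the number of queries sent to the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the source is actually read.
    public static int Reads;

    // Stand-in for a database table; each enumeration counts as one "query".
    public static IEnumerable<string> Authors()
    {
        Reads++;
        yield return "Jane Doe";
        yield return "Sample Author";
    }

    public static void Main()
    {
        // Building the query reads nothing.
        IEnumerable<string> query = Authors().Where(n => n.StartsWith("J"));
        Console.WriteLine("After building the query: " + Reads); // 0

        // ToList() enumerates immediately.
        List<string> list = query.ToList();
        Console.WriteLine("After ToList(): " + Reads); // 1

        // Enumerating the query again re-runs it; enumerating the list does not.
        foreach (string name in query) { }
        foreach (string name in list) { }
        Console.WriteLine("After two foreach loops: " + Reads); // 2
    }
}
```

The practical consequence is that a deferred query can still be refined before it runs, while a materialized list is a fixed snapshot of the data.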

Inserting real life data

We will first add a new Publication to the database. We have to:

1. Create a new object of type Publication.

2. Set all of its mandatory properties. They are mapped to the non-nullable columns in the database, so it’s easy to see which ones are mandatory.

3. Set the mandatory references, which are a specific case of mandatory properties. In our case, we must set an Author for the new Publication.

4. Add the new object instance to the context, using one of the methods which EF generates on the context for each entity type. In our case, we call the AddToPublications() method.

5. Call the SaveChanges() method on the context, but only when we want to persist the changes to the database. Until then, our object can still be used from memory – the context’s ObjectCache.

So, this is sample code for adding a new Publication for a specific Author, “Jane Doe”:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Publication pub = new Publication();
pub.Author = context.Authors.Where(a => a.Name == "Jane Doe").FirstOrDefault();
pub.IsBookOrVideo = true;
pub.Title = "Developing web applications with Silverlight 3.0";
context.AddToPublications(pub);
context.SaveChanges();

// checking the new title in the database
foreach (Publication p in context.Publications)
{
    Console.WriteLine(p.Title);
}

In this sample, when setting the Author of the new Publication we first query the database to get a specific Author. We will see further on that there is another way of setting the Publication’s Author, without making an additional call to the database.

Here is the result after executing the code:

Another way to set the Publication’s Author is to set the AuthorId property, without calling the database. In real-life situations, we may not want to make an extra call to the database if we already know the unique identifier of the Author we want to assign to our Publication.

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Publication pub = new Publication();
pub.AuthorId = 2;
pub.IsBookOrVideo = true;
pub.Title = "Developing web applications with Silverlight 4.0";
context.AddToPublications(pub);
context.SaveChanges();

// checking the new title in the database
foreach (Publication p in context.Publications)
{
    Console.WriteLine(p.Title);
}

Notice that instead of setting the Author property of our new Publication, we set the AuthorId property. Now, I know this is not an “Object Oriented” approach, it’s more of a data-oriented approach, but I don’t see any problem in using this kind of “shortcut” if it helps performance. Of course, the limitation is that we must already know the Id of the Author we want to set.

Here is the result:

Updating real life data

Let’s say we want to change the Title of a Publication from our database. Here is the simplest code we can write in order to achieve this goal:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();

// display all existing publications
foreach (Publication p in context.Publications)
{
    Console.WriteLine(p.Id + "\t" + p.Title);
}

Console.WriteLine("Enter the Id and Title for updating the chosen Publication:");
Console.Write("Id: ");
int pubId = Convert.ToInt32(Console.ReadLine());
Console.Write("Title: ");
string pubNewTitle = Console.ReadLine();

Publication pub = context.Publications.Where(p => p.Id == pubId).FirstOrDefault();
pub.Title = pubNewTitle;
context.SaveChanges();

// checking the updated title in the database
foreach (Publication p in context.Publications)
{
    Console.WriteLine(p.Id + "\t" + p.Title);
}

Please take a look at what we do in the code above:

1. We first list all Publications.

2. The application requests entering the Id of a Publication to be modified. Then, it requests the

new Title.

3. A query is made to select the Publication with the given Id.

4. We change the Title property on the object.

5. We save the changes to the database.

6. We list the Publications again to check that everything worked OK.

After running the code:

A problem I want to mention is related to step 3: in order to select the Publication we want to change, we make a query to the database. There may be no need for that, especially since we have already loaded all the Publications in order to list them, and, as I said before, the context manages the ObjectCache: it remembers the objects which have been loaded. Another way to accomplish the same thing is to replace the FirstOrDefault() call with a method which first looks in the ObjectCache and goes to the database only if the object cannot be found there.

So, in the code above, we replace this line:

Publication pub = context.Publications.Where(p => p.Id == pubId).FirstOrDefault();

With this line:

Publication pub = (Publication)context.GetObjectByKey(
    new System.Data.EntityKey("DataLayerEntityFrameworkEntities.Publications", "Id", pubId));


GetObjectByKey() needs a parameter of type EntityKey, which represents the way EF uniquely identifies objects. The ObjectContext can return an object based on its EntityKey, which is formed by: the entity set name (fully qualified), the key name (the name of the property / column which uniquely identifies the objects of the given type), and the value of that property / column.

So, GetObjectByKey() first looks in the ObjectCache for an object with the given EntityKey. If it finds the object there, it returns it; if not, a call is made to the database. This approach performs better than using FirstOrDefault(), but it does not guarantee that the object it returns has the latest state: between the first load of the object and the GetObjectByKey() call, the values in the database may have been changed.
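One practical detail: GetObjectByKey() throws an ObjectNotFoundException when no object with the given key exists in the cache or in the database. A minimal sketch of a more defensive lookup uses ObjectContext.TryGetObjectByKey(), which returns false instead of throwing. It assumes the generated context and Publication entity used in this chapter; the PublicationLookup class and FindPublication helper are hypothetical names introduced only for illustration.

```csharp
using System.Data;          // EntityKey
using System.Data.Objects;  // ObjectContext

static class PublicationLookup
{
    // Hypothetical helper: returns the Publication with the given Id, or null.
    // Assumes the generated DataLayerEntityFrameworkEntities context of this chapter.
    public static Publication FindPublication(
        DataLayerEntityFrameworkEntities context, int pubId)
    {
        EntityKey key = new EntityKey(
            "DataLayerEntityFrameworkEntities.Publications", "Id", pubId);

        object found;
        // Checks the objects already tracked by the context first;
        // the database is queried only when the object is not there.
        if (context.TryGetObjectByKey(key, out found))
        {
            return (Publication)found;
        }
        return null;
    }
}
```

With such a helper, the calling code can test the result for null before changing the Title, instead of handling an exception.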

Adding computed properties to an Entity

Computed properties are properties of our entities which are computed at runtime and do not map to a specific column in the database. For example, we would like to have a string property on Author which contains both the author's name and address; let’s call it Contact. We cannot add this property in the Entity Designer, since EF forces us to map all properties to columns from the database.

A workaround is to have a new file for the partial class Author, where we add this new property:

partial class Author
{
    public string Contact
    {
        get { return Name + "\t" + Address; }
    }
}

Now we can write a piece of code which shows the value of this new Property:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Author a = context.Authors.FirstOrDefault();
Console.WriteLine(a.Contact);

Resolving concurrency issues

In a client-server scenario, where you have the ObjectContext running on a client application (console,

windows, Silverlight, etc) and the database on a shared database server, concurrency issues may appear.

Let’s exemplify this:

1. We have User1 who starts an application which lists all Authors.

2. We have User2 who does the same.

3. User1 changes the Name of the Author with Id=1. The changed value goes into the database.

a. Id = 1; Name = “Changed by User1”; Address = “New York”

4. User2 changes the Address of the same Author with Id=1. The updated Author is persisted into the database, but because it is taken from User2's memory, the old Name is persisted as well, overwriting the changes of User1.

a. Id = 1; Name = “John Smith”; Address = “Changed by User2”

Here is the sample code for this scenario:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Author author1 = context.Authors.Where(a => a.Id == 1).FirstOrDefault();
Console.WriteLine(author1.Name + "\t" + author1.Address);

Console.Write("New Name: ");
string name = Console.ReadLine();
Console.Write("New Address: ");
string address = Console.ReadLine();

author1.Name = name;
author1.Address = address;
context.SaveChanges();

We have to start 2 instances of the application. In instance 1, we enter a new Name for the Author, keeping the same Address, and we save. In instance 2, we keep the Name and enter a new Address. Then we look into the database and see that the changes from instance 1 have been overwritten by the ones from instance 2.

We enter data in parallel:

We hit Enter in the instance above, so it finishes execution and saves a new Name in the database:

Then we hit Enter on the second instance, which finishes execution and saves its data:

Now we take a look into the database:

We notice that the Author with Id=1 has the values from the second instance of our application, so the

changes performed in instance 1 were overwritten.

Maybe this is the behavior we want, but sometimes we need to know when somebody tries to update an object in the database while holding an old version of it. EF gives us a very simple way of resolving this concurrency issue: in the Entity Designer, we can select the properties of an entity that we want to watch for concurrency conflicts, and change their Concurrency Mode attribute to Fixed:

We apply this change for both Name and Address properties of Author. Then we run the two

applications again, and repeat the steps above to get this result:

We can see that the second update throws an exception, which tells us that our data is out of sync with the database. EF is able to detect this because of the way it builds the command which updates the Author. Here is the command sent to the database server:

We can see that the WHERE clause filters not only on the Id of the Author, but also on all the fields which have been set as Concurrency Mode = Fixed: Name and Address. For these fields EF sends the original values, that is, the values it read when the object was loaded. Since this command does not find a matching Author (the Name has already been changed by another application instance), no rows are updated; EF detects this and throws an exception.
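The mechanism can be sketched without a database. The following is not EF code: it is a small in-memory simulation, with made-up class and method names (AuthorRow, ConcurrencyDemo, Update), of an UPDATE whose WHERE clause also compares the original Name and Address. The SQL shape in the comment is an assumption based on the description above, not the exact command EF generates.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for one row of the Author table (not an EF entity).
class AuthorRow
{
    public int Id;
    public string Name;
    public string Address;
}

static class ConcurrencyDemo
{
    // Mimics: UPDATE Author SET Name = @newName, Address = @newAddress
    //         WHERE Id = @id AND Name = @origName AND Address = @origAddress
    // Returns the number of rows affected (0 or 1).
    public static int Update(List<AuthorRow> table, int id,
                             string origName, string origAddress,
                             string newName, string newAddress)
    {
        int affected = 0;
        foreach (AuthorRow row in table)
        {
            if (row.Id == id && row.Name == origName && row.Address == origAddress)
            {
                row.Name = newName;
                row.Address = newAddress;
                affected++;
            }
        }
        return affected;
    }

    public static void Main()
    {
        var table = new List<AuthorRow>
        {
            new AuthorRow { Id = 1, Name = "John Smith", Address = "New York" }
        };

        // Both users loaded the same original values before editing.
        int user1 = Update(table, 1, "John Smith", "New York",
                           "Changed by User1", "New York");
        int user2 = Update(table, 1, "John Smith", "New York",
                           "John Smith", "Changed by User2");

        Console.WriteLine(user1); // 1: the original values still matched
        Console.WriteLine(user2); // 0: Name no longer matches, nothing is updated
    }
}
```

A result of zero affected rows is the signal that the in-memory copy is stale; in EF this is the situation reported through the exception.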

This exception can be caught, and inside the catch block we can choose what to do. For example:

try
{
    context.SaveChanges();
}
catch (OptimisticConcurrencyException ex)
{
    context.Refresh(System.Data.Objects.RefreshMode.StoreWins, author1);
}

In this case, our in-memory Author is refreshed with the newest values from the database. From now

on, we may choose to do anything we’d like, for example try again to perform the update which will

work because our object is in sync.
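One detail to keep in mind when retrying: RefreshMode.StoreWins replaces the in-memory values with the ones from the database, so the pending edits are discarded and have to be applied again. A minimal sketch of a single retry follows; it is a fragment that continues the sample above and assumes its context, author1 and address variables. Which edits to re-apply is a design decision; here we assume, for illustration only, that the user meant to change just the Address.

```csharp
try
{
    context.SaveChanges();
}
catch (OptimisticConcurrencyException)
{
    // Reload the current database values into author1 (local edits are lost).
    context.Refresh(System.Data.Objects.RefreshMode.StoreWins, author1);

    // Re-apply only the edit that should win (assumption: the Address),
    // so that the Name saved by the other user is preserved.
    author1.Address = address;

    // Single retry; this can still fail if the row changes again in between.
    context.SaveChanges();
}
```

RefreshMode.ClientWins is the alternative: it keeps the in-memory changes and refreshes only the original values, so the next SaveChanges() overwrites what is in the database.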

Mapping stored procedures to the model

The sample application used for this section is found in the folder called “02_CommonScenarios_2”.

Very often some of our logic needs to stay at the database level, say inside some stored procedures.

There are many situations when we would prefer to use stored procedures instead of classic procedural

code (business logic code), and some examples would be: encrypting and decrypting data, working with

geospatial data types, or simply choosing to do all the CRUD (Create/Read/Update/Delete) operations at

the database level to leverage the power of a database engine in terms of performance.

We use the same database “DataLayerEntityFramework”, with 3 tables: Author, Publication and PublicationResource. For a simple scenario, let’s say we want to create a stored procedure which gets all the Publications of a specific Author, and we want to execute that stored procedure using our Entities model.

Here are the steps to accomplish our goal:

1. We create the stored procedure. The database backup in the example folder already contains a

stored procedure called “[dbo].[Get_Publications_ForAuthor]”, which only contains this

statement and an input parameter of type int:

SELECT * from Publication where AuthorId = @AuthorId

2. In Visual Studio, inside the edmx designer, we right click and select Update Model From

Database.

3. In the Update Wizard, in the Add section we check the Stored Procedure and then click Finish

and save our edmx file:

4. Inside the edmx designer, we right click, and select Add -> Function Import.

5. We complete the name of the function, select the stored procedure from the list, and select the

return type which is a Collection of Publication.

6. To test the function import, write the following code inside the Main method:

DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
List<Publication> pubs = context.GetPublicationsForAuthor(1).ToList();
foreach (Publication pub in pubs)
{
    Console.WriteLine(pub.Title);
}

7. And here is the result:

We saw that using Stored Procedures from Entity Framework is quite simple. Actually, this was even

possible in the first version of EF which comes with .NET 3.5 SP1, but that one had some limitations. One

of the most important limitations was the inability of EF to map a function import on a Stored Procedure

which does not return a result composed of defined Entities. For example, if we wanted our stored

procedure above to get not only the Publications but also their PublicationResources, in the previous

version of EF it wouldn’t be possible.

So, EF from .NET 4.0 provides the ability to map the resultset coming from a stored procedure to a new

complex type, which is not necessarily an Entity in the model. Here are the steps to map such a stored

procedure to the model and use it:

1. We have a stored procedure called “GetPublicationsAndResourcesForAuthor” which looks like

this:

Select p.*, pr.ResourceName, pr.ResourceUrl from Publication p

inner join PublicationResource pr on p.Id = pr.PublicationId

where p.AuthorId = @AuthorId

2. We update our model to include the new stored procedure.

3. We right click on the edmx designer to add a Function Import, choose the new stored

procedure, click on the Get column Information button to get the resultset schema, and then

click on Create New Complex Type to create a new type which will store the result:

4. The stored procedure is now mapped into our model, and will return a collection of the new

type PublicationWithResource. Here is the test code:

List<PublicationWithResource> pubsRes = context.GetPublicationsAndResourcesForAuthor(2).ToList();
foreach (PublicationWithResource pr in pubsRes)
{
    Console.WriteLine(pr.Title + "\t\t" + pr.ResourceName);
}

5. And here is a possible result:

POCO Entities

POCO stands for Plain Old CLR Object: an object with nothing particular on it except some properties. EF 3.5 did not give us the possibility to create POCO entities. Our entities were subject to some restrictions: any EF entity had to inherit from EntityObject or implement some EF-specific interfaces, which made it hard to create domain classes independent of any persistence concerns. EF 4.0, however, lets us create types which do not need to inherit from EntityObject or implement any of those interfaces.

When using POCO entities, we benefit from all the functions of Entity Framework, as if we were using EntityObject-based entities, but everything is much simpler.

For this section, we will use the sample application and database in folder “02_CommonScenarios_3”.

The application has already defined the model and the POCO entities, which was done following these

steps:

1. In a new Console Application project, right click and select Add -> New Item then choose

ADO.NET Entity Data Model.

2. Walk through the model creation wizard.

3. At the moment when Visual Studio opens the edmx designer, before the .Designer.cs file is

created with some generated code, change one property in the tool, Code Generation = None:

4. Save the edmx file and notice that the .Designer.cs file is empty (actually it has an explanatory

comment).

At this point, we have the edmx file (model, mappings, etc.) but we do not have our entity classes, so we can begin writing our own POCO entities. Here are the steps:

1. We add a new class to the project, named Author:

public class Author
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
    public List<Publication> Publications { get; set; }
}

2. We add a new class to the project, named Publication:

public class Publication
{
    public int Id { get; set; }
    public string Title { get; set; }
    public bool IsBookOrVideo { get; set; }
    public int AuthorId { get; set; }
    public Author Author { get; set; }
}

3. We create our ObjectContext class, in order to be able to manage the entities:

class DataLayerContext : ObjectContext
{
    public DataLayerContext()
        : base("name=DataLayerEntityFrameworkEntities", "DataLayerEntityFrameworkEntities")
    {
        _authors = CreateObjectSet<Author>();
        _publications = CreateObjectSet<Publication>();
    }

    public ObjectSet<Author> Authors
    {
        get { return _authors; }
    }
    private ObjectSet<Author> _authors;

    public ObjectSet<Publication> Publications
    {
        get { return _publications; }
    }
    private ObjectSet<Publication> _publications;
}

4. Now we have: the model, the POCO entities written as we wanted, and the ObjectContext class.

We can test that everything is working as it should by inserting this code in the Main method:

DataLayerContext context = new DataLayerContext();
Publication pub = context.Publications.FirstOrDefault();
Console.WriteLine(pub.Title);

5. And here is the result:

Mapping the POCO entities

You may be wondering how EF does the mapping and correlation between our POCO entities and the

metadata (edmx). In EF 3.5, this was achieved by attributes but now it’s based on convention: entity

type names, property names, complex type names used in our POCO classes must match those defined

by the conceptual model.

The demo application “02_CommonScenarios_3” includes some examples of simple operations (CRUD). I won’t go through all of them here, but I will point out some differences from EF 3.5 and discuss the more important aspects.

Fixing up relationships

For example, let’s say you want to add a new Publication to the database. One way to do that, as we

have already seen, is to create a new object of type Publication and set all its properties with desired

values, including the references. Another way to set the Publication’s Author, besides its AuthorId

property, is to simply add it to the collection of an existing Author:

Author auth1 = context.Authors.Include("Publications").FirstOrDefault();
auth1.Publications.Add(new Publication()
{
    Title = "New Pub Test",
    IsBookOrVideo = true
});


The interesting aspect in the previous example is that EF automatically sets the reference between the

Author and the Publication.

Other POCO benefits

There are many benefits when creating our POCO entities and not using the old-style EF entities. I will

list some of them, but I am sure there are more:

1. Our POCO entities are simpler to read, because they do not carry extra stuff like inheritance

from EntityObject, attributes needed for mapping, etc.

2. Now it’s much simpler to add computed properties, because our classes are not designer-

generated but created by our hand.

3. In EF 3.5 I had this problem: I wanted to decorate some properties (which were mapped to

columns) with attributes, say for business rules validation. This was not supported, and I could

do this by some workarounds only (e.g. create a copy of the property, and use it from the

business logic perspective). Now we don’t have to worry about anything, just put our attributes

where we want them to be.

So, I changed the Author class above to the following, to include a computed property and a custom attribute:

public class Author
{
    public int Id { get; set; }

    [PersonNameValid]
    public string Name { get; set; }

    public string Address { get; set; }
    public List<Publication> Publications { get; set; }

    public string Contact
    {
        get { return Name + "\t" + Address; }
    }
}

Please notice the computed property Contact, which is not mapped to any column in the database and does not have a setter, and the “PersonNameValid” attribute used to validate user input. The attribute and its logic are found in the “02_CommonScenarios_3” solution; I do not include that code here for reasons of space.

Here is the test code for the custom property and the attribute:

DataLayerContext context = new DataLayerContext();
Author auth1 = context.Authors.Include("Publications").FirstOrDefault();
Console.WriteLine(auth1.Contact);

auth1.Name = "Test/Not valid";
BusinessValidation.Validate(auth1);
context.SaveChanges();

The result shows the computed property value, followed by an exception thrown because the name entered is not valid:

Lazy loading with POCO entities

You already know that lazy loading is possible in Entity Framework, even in its first version. This comes

out of the box when creating our entities with the default code generation technique, but when we are

using POCO entities we must explicitly tell EF how to handle lazy loading.

There are two things we must do to have lazy loaded references on our POCO entities:

1. Declare the property which represents the reference as virtual.

2. Enable deferred loading on the hand-crafted context that we have.

Here is some code which tries to print the Title of the first Publication of the first Author:

DataLayerContext context = new DataLayerContext();
Author auth1 = context.Authors.FirstOrDefault();
Console.WriteLine(auth1.Publications.FirstOrDefault().Title);

Here is the result:

To enable lazy loading on our entity, we change the Publications property in Author:

public virtual List<Publication> Publications { get; set; }

Then we enable lazy loading on the context:

context.ContextOptions.LazyLoadingEnabled = true;


So what does EF do under the covers? When lazy loading is enabled and the navigation property is virtual, EF provides a dynamic proxy instance of our POCO entity at runtime. In plain English: EF creates a dynamic class derived from the POCO entity, which overrides the virtual properties so that it can perform lazy loading. Here is the auth1 variable at runtime; notice its type:
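The proxy idea itself can be sketched without EF. The code below is not how EF builds its proxies internally; it is a hand-written illustration of why the property has to be virtual. The AuthorProxy class, its loader delegate and the sample title are made up for this sketch, and the Author and Publication classes are simplified stand-ins for the entities of this chapter.

```csharp
using System;
using System.Collections.Generic;

public class Publication
{
    public string Title { get; set; }
}

public class Author
{
    public int Id { get; set; }

    // virtual, so that a derived class can intercept access to it
    public virtual List<Publication> Publications { get; set; }
}

// Hand-written stand-in for the class EF would generate at runtime.
public class AuthorProxy : Author
{
    private readonly Func<int, List<Publication>> _loader;
    private bool _loaded;

    public AuthorProxy(Func<int, List<Publication>> loader)
    {
        _loader = loader;
    }

    public override List<Publication> Publications
    {
        get
        {
            if (!_loaded)
            {
                // First access: fetch the related objects, then remember them.
                base.Publications = _loader(Id);
                _loaded = true;
            }
            return base.Publications;
        }
        set
        {
            base.Publications = value;
            _loaded = true;
        }
    }
}

public static class LazyLoadingDemo
{
    public static void Main()
    {
        int loads = 0;
        Author author = new AuthorProxy(id =>
        {
            loads++; // stands in for a database query
            return new List<Publication> { new Publication { Title = "Sample title" } };
        }) { Id = 1 };

        Console.WriteLine(loads);                        // 0: nothing loaded yet
        Console.WriteLine(author.Publications[0].Title); // Sample title
        Console.WriteLine(author.Publications.Count);    // 1
        Console.WriteLine(loads);                        // 1: loaded only once
    }
}
```

The calling code works with a variable of type Author and never sees the derived class, which is why the runtime type of the object differs from the declared one.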

Explicit loading

Though lazy loading is quite good in terms of performance tuning, you might want to load the navigation

properties at specific moments in time, and to be in complete control over the operation.

This is possible, because now with EF 4.0 you can explicitly load relationships on POCO entities when you

like.

Here is the sample code for explicitly loading a reference / navigation property:

Publication pub = context.Publications.Where(p => p.Id == 8).FirstOrDefault();
context.LoadProperty(pub, p => p.PublicationResources);
foreach (PublicationResource pubRes in pub.PublicationResources)
{
    Console.WriteLine(pubRes.ResourceName);
}

Generating the code for POCO entities

The POCO entities that we write on our own do not always have to be written from scratch. We can use code generation templates, enabled by the tool called Text Template Transformation Toolkit (T4 for short).

Many to many entities

The demos for this section use the database called “Conferences” which has the following structure:

We have Conferences and Users, and any User can participate in any Conference. This is a many-to-many relationship between User and Conference. When we create an object model for this database using Entity Framework, we discover that EF handles this type of relationship well: it only creates 2 entities, Conference and User, like this:

Of course, there is no need for an extra entity which would correspond to UserAtConference table as

long as UserAtConference does not have any other fields but the IdConference and IdUser – relationship

fields. So EF is smart and eliminates an unnecessary entity, but if UserAtConference had extra fields,

then EF would create an entity corresponding to this table.

If we want to add an existing User to an existing Conference, here is how to do it:

ConferencesEntities context = new ConferencesEntities();
Console.WriteLine("Add a User to a Conference");
Console.WriteLine("Username: ");
string username = Console.ReadLine();
Console.WriteLine("Conference: ");
string conferenceTitle = Console.ReadLine();

User userToAdd = context.Users.Where(u => u.Username == username).FirstOrDefault();
Conference confDest = context.Conferences
    .Where(c => c.Title == conferenceTitle)
    .FirstOrDefault();
confDest.Users.Load();
confDest.Users.Add(userToAdd);
context.SaveChanges();

If we want to add a new User to an existing Conference (which means saving the User to the database

and adding it to a Conference):

ConferencesEntities context = new ConferencesEntities();
User userNew = new User()
{
    Address = "Bucharest, Romania",
    Username = "mircea"
};
Conference conf1 = context.Conferences.FirstOrDefault();
conf1.Users.Add(userNew);
// persist the new User and its link to the Conference
context.SaveChanges();


Inheritance

In many real-life situations, we find ourselves in the place where we have hierarchies of entities, which

need to be stored in the database behind our application. Most of the time we would like to represent

such entities in a hierarchical manner in the application – we would have classes like User (abstract) and

UserAdmin and UserPublic which inherit from User – but in the database there are more possibilities to

store this data.

Table per hierarchy (TPH)

In this model, we store all the entities in one table, having a column or a set of columns as differentiator

between concrete types. For the example above, we would have one table User with a column

IsAdmin(bit) as differentiator. Of course, there are columns which are specific to a concrete type, and

these should be marked as nullable since the other types cannot have values for those columns.

Table per type (TPT)

Here the properties of the base type are stored in a table which is shared between the other “concrete” tables, which only store type-specific data. For our example above, we have 3 tables: User, with all the common data; UserAdmin, with the properties specific to the Admins; and UserPublic, with the properties specific to the Public type. Each of the latter two tables has a foreign key relationship to the shared table, in order to keep the Id unique across types.

Table per concrete class (TPC)

In this model, there is one table for each concrete type, with a column for each property in that type.

For our situation: table Admin (Id, Username, Team) and table Public (Id, Username, Address). Of

course, we must have a way to determine the uniqueness of the Ids across these 2 tables – maybe we

don’t want to have a Public user with Id = 20, and an Admin user with the same Id = 20.

Choosing between strategies

In short, I would not recommend TPC, since it does not seem very intuitive, at least to me. Another limitation of TPC is the lack of support in the Entity Framework designer.

For TPH, the best argument is performance: we do not need to join 2 tables when we query the

database for data of a specific type. Personally, I use this strategy most of the time, but there are some

drawbacks which must be taken into consideration.

So, the TPT advantages over TPH are:

Flexibility – it is easy to add a new type to the model, since you only have to add a new table.

Data validation – the need to have nullable columns in TPH does not allow for validation at the

database level. In TPT, we simply do not store in the same table data from different types.

For a more detailed comparison between strategies, please read this blog post by Alex James, a Program Manager working on the ADO.NET team at Microsoft: http://blogs.msdn.com/alexj/archive/2009/04/15/tip-12-choosing-an-inheritance-strategy.aspx

Samples

The samples for this paragraph are found in the 02_CommonScenarios_Inheritance folder. Here is the

database diagram used in this paragraph:

The database contains 4 tables: one table called User for the TPH model and 3 more for the TPT model.

TPH sample

In the solution found in the 02_CommonScenarios_Inheritance folder, there are more samples. Let’s

first discuss the TPH model of storage, by describing the steps necessary to map our entities with the

User table:

1. Create the User table in the database. It contains all the columns, for all types of concrete users.

2. In Visual Studio, add a new Entity Data Model, and select the User table from the database.

3. The EF model in Visual Studio now contains an Entity called User, which is a 1:1 representation

of the User table. This is not what we need, but we must have 3 entities in our application

model: an abstract class User, and 2 concrete classes, UserPublic and UserAdmin namely. In

order to achieve this, we must manually create 2 entities in the model.

4. Add a new Entity in the model, and name it UserPublic and set it to inherit from User. For this, in

the edmx designer perform right click -> Add -> Entity:

5. Do the same for UserAdmin. Now the model should look like this:

6. If you Build the solution now, you will notice some errors. They are due to the fact that our

entities are not mapped properly. We still need to do some things in order to have everything

set.

7. Select the entity UserPublic and, in the Mapping Details window, set its mapping to the User table and add the condition IsAdmin = false:

8. Do the same for UserAdmin: map it to User table and add the condition IsAdmin = true.

9. Now, our User class contains the IsAdmin property which is no longer needed, since the derived

classes use its value as a type differentiator. So, remove the IsAdmin property and change the

User class to Abstract: true in the Properties window:

10. Build the solution and you should see no errors.

11. Now, if we want to see the Public Users or the Admins, we need to write some code like the following, using the OfType<>() method on the Users entity set:

DataLayerEFInheritanceEntities context = new DataLayerEFInheritanceEntities();
List<UserAdmin> admins = context.Users.OfType<UserAdmin>().ToList();
foreach (UserAdmin admin in admins)
{
    Console.WriteLine(admin.Username + "\t" + admin.Team);
}

12. Run the application, and you should see the users of type Admin:
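The roles of the IsAdmin discriminator and of OfType<>() can be sketched without a database. The code below uses LINQ to Objects on an in-memory list rather than the EF query above, and the Materialize helper and the sample data are made up for illustration. The Team and Address properties follow the column lists given in the TPC description.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public abstract class User
{
    public int Id { get; set; }
    public string Username { get; set; }
}

public class UserAdmin : User
{
    public string Team { get; set; }
}

public class UserPublic : User
{
    public string Address { get; set; }
}

public static class TphDemo
{
    // Hypothetical stand-in for reading one row of the single User table:
    // the IsAdmin column decides which concrete type is created.
    public static User Materialize(int id, string username, bool isAdmin,
                                   string team, string address)
    {
        if (isAdmin)
        {
            return new UserAdmin { Id = id, Username = username, Team = team };
        }
        return new UserPublic { Id = id, Username = username, Address = address };
    }

    public static void Main()
    {
        // Made-up rows; columns that do not apply to a type are null.
        var users = new List<User>
        {
            Materialize(1, "anna", true, "Core", null),
            Materialize(2, "bob", false, null, "Bucharest"),
            Materialize(3, "carl", true, "Web", null)
        };

        // OfType<T>() keeps only the elements whose runtime type is T.
        foreach (UserAdmin admin in users.OfType<UserAdmin>())
        {
            Console.WriteLine(admin.Username + "\t" + admin.Team);
        }
    }
}
```

In the EF version the same filter is translated into a condition on the IsAdmin column, so only the matching rows are read from the User table.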

TPT sample

This sample is also included in the 02_CommonScenarios_Inheritance folder and uses the same

database as the previous one. Here are the steps needed to create our model in a TPT scenario:

1. Add an Entity Data Model to the application, and select the 3 tables called User2, User2Admin,

User2Public. The edmx designer should show something like this:

2. Notice that EF mapped association relationships between our entities, but we need inheritance.

So, we will remove existing relationships and create new ones. For table User2Admin, delete the

association, then right click the entity -> Add -> Inheritance, select User2 as base type and

User2Admin as derived type:

3. Do the same for User2Public, and then the model should look like this:

4. If you build now, you will notice some mapping errors, because we should not still have the

UserId properties in the derived classes. They are not needed in the derived entities, since they

should inherit the Id property from the base type (remember: our Ids are unique among the

tables).

5. For both derived entities, we have to delete the UserId property, and then map the UserId

column from the underlying table to the inherited Id property:

6. The last step we must do is mark the User2 class as abstract. Now if we build, there are no

errors and our model is ready to be used. Here is a sample code to get the entities using our

model:

DataLayerEFInheritanceEntities1 context = new DataLayerEFInheritanceEntities1();
List<User2Admin> admins = context.User2.OfType<User2Admin>().ToList();
foreach (User2Admin admin in admins)
{
    // ...
}


7. And an expected result:

TPT with POCO entities sample

Since EF 4.0 introduced the possibility of creating our own POCO entities, I think this would make an

interesting sample. So, the scenario is Table per Type, but we write our own entities (see the POCO

paragraph for more details).

Before we start, I must warn you to create the edmx (EF model) in a separate application (found in the 02_CommonScenarios_Inheritance_Poco folder), to make sure that it is the only edmx in its assembly. When I first tried to build the demo for this section I encountered the following error, because my TPT POCO edmx was created in the same assembly as the other 2 models used for the previous examples: “Mapping and metadata information could not be found for EntityType”.

So, in order to have your TPT strategy implemented using POCO entities, you have to follow these steps:

1. Create the model mapped to the 3 tables: User2, User2Admin, User2Public, but do not generate

the code (remember POCO entities from a previous paragraph!):

2. Perform all the needed steps to have the TPT entities model (as in the previous sample).

3. Create all the POCO entities by hand. You will have 3 classes: User2, User2Admin, User2Public –

with the corresponding properties.

4. Create your own ObjectContext.

5. Write a sample code to test everything:

InheritancePocoContext context = new InheritancePocoContext();
foreach (User2Admin admin in context.Users.OfType<User2Admin>())
{
    // ...
}


(To come …)

4. Advanced topics in Entity Framework

4.1. Transactions

4.2. Audit and logging with Entity Framework

4.3. Improving performance

4.4. Encrypting data in the database

4.5. Using Entity Framework in different architectures
