Building the Data Layer With Entity Framework 4.0
Building the data layer in a .NET 4.0 application with
Entity Framework
Mihai Tătăran
General Manager, H.P.C. Consulting
Microsoft Most Valuable Professional, ASP.NET
http://www.hpc-consulting.ro/index.php/blog/
http://www.codecamp.ro
1. Introduction
1.1. Who is this book for
2. The data layer with an O/RM
2.1. What is an Object/Relational Mapper
2.2. Why building the data layer with an O/RM
3. Introducing Microsoft Entity Framework
3.1. The first application with Entity Framework
Setting up the demo
Getting some data from the database
Adding data to the database
3.2. Entity Framework explained
Entities
ObjectContext
Working with data
3.3. Common scenarios
Loading references (child Entities). Lazy versus Eager Loading
Delayed query execution
Inserting real life data
Updating real life data
Adding computed properties to an Entity
Resolving concurrency issues
Mapping stored procedures to the model
POCO Entities
Many to many entities
Inheritance
Table per hierarchy (TPH)
Table per type (TPT)
Table per concrete class (TPC)
Choosing between strategies
Samples
TPH sample
TPT sample
TPT with POCO entities sample
4. Advanced topics in Entity Framework
4.1. Transactions
4.2. Audit and logging with Entity Framework
4.3. Improving performance
4.4. Encrypting data in the database
4.5. Using Entity Framework in different architectures
1. Introduction
1.1. Who is this book for
This book is created as a guide on how to build the data layer using Entity Framework in Microsoft .NET
4.0, targeting programmers with experience in Microsoft .NET technologies.
2. The data layer with an O/RM
2.1. What is an Object/Relational Mapper
There are many possibilities for creating the data layer in a .NET application. O/RM tools represent one
of these possibilities, and there are many technologies which can be used to accomplish this goal. I will
enumerate a few: Entity Framework, NHibernate, Data Objects, etc.
Generally, an O/RM is a technology which helps you transform data from its relational form into an
object form. In the database you typically store data in a relational manner (tables, views, stored
procedures, etc.), and in an application you have data represented as objects and collections of objects.
Thus, the counterpart of a table from the relational world is a class from the object world; the
counterpart of a record in a table from a database is an instance of a class in the application; and
so on.
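The table-to-class and record-to-instance correspondence can be sketched in a few lines of plain C#. This is only a hand-written illustration of what an O/RM automates: the Product class, the ProductMapper helper and the dictionary standing in for a database record are all hypothetical names, not part of Entity Framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity: the object-world counterpart of a "Product" table.
public class Product
{
    public int ProductId { get; set; }
    public string Name { get; set; }
}

public static class ProductMapper
{
    // Turns one relational record (column name -> value) into an object.
    public static Product FromRow(IDictionary<string, object> row)
    {
        return new Product
        {
            ProductId = (int)row["ProductId"],
            Name = (string)row["Name"]
        };
    }
}

public static class MappingDemo
{
    public static void Main()
    {
        // Stand-in for a record read from the database.
        var row = new Dictionary<string, object>
        {
            { "ProductId", 1 },
            { "Name", "BMW" }
        };

        Product p = ProductMapper.FromRow(row);
        Console.WriteLine(p.ProductId + ": " + p.Name);
    }
}
```

An O/RM generates this kind of mapping code (in both directions) from the model, so it does not have to be written and maintained by hand for every table.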
O/RM tools and technologies help you mainly by automating the processes of getting data from the
database and storing it in your objects, and the other way around. They usually come with a designer
which enables you to create the object entities very fast based on the database schema, or the other
way around.
2.2. Why building the data layer with an O/RM
You should use an O/RM for your application when the application is large enough. I know, “large
enough” is very ambiguous, but I want to emphasize that applications which take a long time to build and
maintain are worth the effort of building them in an Object Oriented manner.
As I said, there are other possibilities for building the data layer in a .NET application: (1) creating all the
code from scratch, maybe using some data layer patterns like Table Data Gateway
(http://martinfowler.com/eaaCatalog/tableDataGateway.html) or Active Record
(http://www.martinfowler.com/eaaCatalog/activeRecord.html), etc.; (2) using Microsoft Enterprise
Library – Data Access Application Block (http://msdn.microsoft.com/en-us/library/ms954836.aspx) –
which is actually a wrapper over the ADO.NET methods; (3) using LINQ to SQL
(http://msdn.microsoft.com/en-us/library/bb425822.aspx) – which is a nice technology for a data layer,
and is often thought of as an O/RM, although it actually is not.
Using an O/RM gives you the advantage of treating data as instances of classes, like business-related
entities. For example when building an online shop, you will have entities (classes) like Product,
ProductCategory, Order, etc. Thus, placing an order in the system implies creating an object of type
Order and associating it with an instance of an existing Product, and storing it to the database.
Something like:
Order o = new Order();
o.Product = context.Product.Where(p => p.Name == "BMW").FirstOrDefault();
context.AddToOrder(o);
context.SaveChanges();
When working in a relational fashion, placing an order means creating an ADO.NET command which
contains an INSERT statement into the Order table, with a unique identifier from the Product table as the
product related to the order. Something like:
SqlCommand com = new SqlCommand(
    "INSERT INTO [Order](..., ProductId) VALUES(..., @productId)", sqlConnection);
// pass the product identifier as a parameter instead of concatenating it into the SQL text
com.Parameters.AddWithValue("@productId", productId);
sqlConnection.Open();
com.ExecuteNonQuery();
sqlConnection.Close();
The first approach is more intuitive and thus easier and cheaper to maintain in the long run.
3. Introducing Microsoft Entity Framework
Microsoft Entity Framework was introduced in the 3.5 SP1 version of .NET in 2008. It’s a new technology
for accessing data, but classic ADO.NET is still supported in .NET 4.0 and will continue to be supported in
future versions.
Besides the general advantages of building the data layer with an O/RM, Entity Framework has a few
more: (1) you don’t need to worry about the underlying database schema, (2) it integrates easily with
other Microsoft technologies like Windows Communication Foundation, Silverlight, ASP.NET and so on.
3.1. The first application with Entity Framework
All the samples in this book will be created on top of the same database, called
DataLayerEntityFramework. The initial structure is presented in the following diagram:
So, we have a database which keeps track of Publications (Books or Videos) which are created by
Authors and which may have additional Resources.
Setting up the demo
For a simple demo, just to see how to make basic use of Entity Framework, we will create a console
application. I will use Visual Studio 2010 Beta 2 for all my examples. The application, called
01_FirstApplication, can also be found in the companion code archive and was created following these
steps:
1. We have a console application created by the Visual Studio 2010 template.
2. We right click on the project, select Add -> New Item and choose ADO.NET Entity Data Model to
add a new edmx file to our project:
[Database diagram: Author (AuthorId, Name, Address); Publication (PublicationId, Title, AuthorId, IsBookOrVideo); PublicationResource (PublicationResourceId, ResourceName, ResourceUrl, PublicationId)]
3. A wizard for creating the Entity Data Model starts. Choose “Generate from database” and click
Next.
4. In the Choose Your Data Connection screen, select the connection to your database server
which has the DataLayerEntityFramework database attached. After clicking on “New
Connection...”, enter the server name, then select the database and click OK:
5. Check “Save entity connection settings in App.Config as:” and click Next.
6. This is the step where we choose which objects from the database should appear in the Entity
Data Model of our application. In our simple example, we only use the tables in the above
diagram, so only check them. Please notice the checkbox “Pluralize or singularize generated
object names”, which is new in .NET 4.0 and which enables plural or singular names for the generated objects:
7. After selecting the tables, click Finish. Now Visual Studio creates:
a. An edmx file, which is basically an XML file, but Visual Studio has a designer for it.
b. A .designer.cs file – which contains the C# code for our entities.
Let’s spend a little time looking at the generated diagram. First, we see the 3 entities corresponding to
the 3 tables in our database. One very important aspect, which underlines the role of an O/RM, is how
relationships between entities are represented. Take for example the two tables from our database:
Author and Publication. A Publication has an Author, and we represent this by a Foreign Key – which is
the way of representing relationships in a relational data store. In our entities’ model, an object of type
Author has a collection of objects of type Publication (called Publications) and an object of type
Publication contains an object of type Author.
The Visual Studio Entity designer supports UML and is totally different from the older Class Diagram
designer. You can observe the Mapping Details section when in the designer, and also the Properties
available for entities and for properties of our entities. For example, if you click on Publication in the
designer, you will see these properties: Abstract = False, Access = Public, and so on – all referring to the
Publication class. If you select the Title property of the Publication class, you see its properties, e.g. Nullable
= False (as it is set in the database) and so on.
Getting some data from the database
Now let us perform some simple operations on the data. We will query the database for the list of available
Publications. In order to work with the entities and the supporting database, we need an instance
of a class derived from ObjectContext, which was generated by the Visual Studio designer when we
created the model. In our case, this class is called DataLayerEntityFrameworkEntities, as you can check
in Model1.Designer.cs. An instance of an ObjectContext handles all the communication with the
database and keeps the references to all the loaded objects; we will explain it more clearly in the following
section.
So, in Program.cs – Main method, we put this code:
static void Main(string[] args) {
    DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
    List<Publication> publications = context.Publications.ToList();
    foreach (Publication pub in publications) {
        Console.WriteLine(pub.Title);
    }
}
We declare a list of Publication to store the query results. We get the publications by using the
context and its Publications property (a kind of queryable collection), but we need to apply the ToList()
method in order to create an actual query to the database, execute it and return the data. If you run
this program, you should get a result similar to this:
Now let’s play a bit with this code. Say we want to see not only the Publication’s Title, but also its
Author’s Name. We change the line which prints to the console as follows:
Console.WriteLine(pub.Title + ". Author: " + pub.Author.Name);
Then we run the application and ...
Now, at this point a reader who worked with Entity Framework in version 3.5 SP1 might be a little
surprised, as I initially was. This is an improvement in .NET 4.0, which automatically loads the pub.Author
object even if we never request it explicitly. What happens in our case is a Lazy Loading mechanism: we first
load the Publications (a query only on the Publication table), then in the foreach loop, at the request for
pub.Author, a new query is made behind the scenes to get the Author for the specific publication.
Lazy Loading is one option regarding references of entities. We have the other one, called Eager
Loading, which makes a query on all specified tables (performs a SQL JOIN actually). So let’s change our
code to this:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
List<Publication> publications = context.Publications.Include("Author").ToList();
foreach (Publication pub in publications) {
    Console.WriteLine(pub.Title + ". Author: " + pub.Author.Name);
}
The result of executing the code above is the same as the previous one. The only difference is how
the queries are made in the database. So, I’ll recap here:
1. In case one (without any Include(...)), context.Publications.ToList() performs a query on the
Publication table. Then, pub.Author.Name performs a query on the Author table with a WHERE
clause.
2. In the second case, a SELECT with JOIN is performed on Publication and Author tables.
Adding data to the database
Now let’s say we want to add a new Publication to the database. We can do this by:
1. Creating a new object of type Publication.
2. Setting all of the mandatory properties. These include setting an Author.
3. Saving the new object to the database.
Here is the code for this scenario:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Publication newPub = new Publication() {
    Title = "New Title",
    IsBookOrVideo = true,
    Author = context.Authors.Where(a => a.Name == "Jane Doe").FirstOrDefault()
};
context.AddToPublications(newPub);
context.SaveChanges();
// check if the new publication is in the database
foreach (Publication pub in context.Publications.ToList()) {
    Console.WriteLine(pub.Title + ". Author: " + pub.Author.Name);
}
I have to explain a few things. Setting the Author property is done by first querying the database for an
Author with the name “Jane Doe”, and if a result is present we assign it to the property. The Where(...)
method is an extension method present in LINQ (Language Integrated Query), used to filter the data
source on which it is called. In our case, we query entities, so we are talking about LINQ to Entities here. The
FirstOrDefault() method returns the first element which matches the query, or null in case of an empty
result.
After we have the new Publication instance, we have to add it to the ObjectContext – so it can keep
track of it from now on. At this point, our context knows about a new Publication, but only in memory. It
will be added to the database the next time we call SaveChanges() on the context. Thus, it’s possible to
perform many operations on the context objects in memory (add new objects, update or delete existing
ones), and at the point we call SaveChanges() a batch of operations (actually a Transaction) is sent to the
database.
And here is the result:
This is it for now. In the following sections I will explain how Entity Framework works and which are
some of the most common scenarios.
3.2. Entity Framework explained
In this section I will explain the underpinnings of Entity Framework.
Entities
An “entity” in Entity Framework is an item (class) which is described in an edmx file and which we can see in the
designer. All objects which contain data from our model are instances of these entities. What makes up an
entity:
1. Data properties – properties which contain the data, and which correspond to the table
columns.
2. Reference properties – objects or collections of objects which represent the references from the
database. In our example called “01_FirstApplication”, the Publication entity has a property
called AuthorReference of type EntityReference<Author>, and Author entity has a property
Publications of type EntityCollection<Publication>.
3. They don’t have behavior – like any other business entities, apart from the methods used to
keep track of entities in the ObjectContext.
ObjectContext
ObjectContext is the primary class for interacting with entities. An instance of this class is needed in order to work with
the underlying database, as it encapsulates:
1. A connection to the database, which is managed transparently for the programmer, but which
can also be managed directly.
2. Metadata for the entities model.
3. An ObjectStateManager object that keeps the entities in the cache.
Working with data
With the help of Entities and the ObjectContext, we can work with data on the application side, in
memory, without having to interact with the database every time we want to change our data, but only
when we want to persist it. Of course, in different architecture types we will have different strategies for
managing the ObjectContext (and so the data cache), and sometimes it won’t be optimal to cache the data in
memory, so we will prefer a stateless ObjectContext. More on this subject in the chapter Using Entity
Framework in different architectures.
Getting data ultimately means creating a query and running it against a database. We do not explicitly
create a SQL command (as we used to do in ADO.NET 2.0) but use the entities and the
ObjectContext. Here is the example we saw in the previous section:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
List<Publication> publications = context.Publications.ToList();
The instance of ObjectContext, context, needs to be created first. After this instantiation we can work with data as long as this object exists. The next line actually creates and executes the query against the database, like this:
1. When we call context.Publications we tell Entity Framework to prepare a query. Of course, we can apply extension methods like Where(...), and in this case the query becomes more specific.
2. When we call ToList() or any method which returns entities or collections of entities (like First(), FirstOrDefault()), the query prepared in step 1 is sent to the database server.
3. The resulting data is sent back to Entity Framework, via the connection to the database, and the relational structure is automatically transformed into entities. At this point, data is loaded in the ObjectContext cache and will continue to exist there as long as the context is alive.
3.3. Common scenarios
From my experience with Entity Framework, I have seen some usage scenarios which are very common.
Almost every EF enabled application will use most of the scenarios listed in this section, and there may
be others which I didn’t mention here.
To exemplify these scenarios, I will use the application “02_CommonScenarios_1” from the companion
code archive, with the same database “DataLayerEntityFramework”.
Loading references (child Entities). Lazy versus Eager Loading
Our sample application already has the model of entities: Author, Publication, PublicationResource. We
will next load an Author together with all his Publications and all of the PublicationResources attached
to them. If you remember the first of our examples, a bit earlier, there are 2 options regarding
references loading: Lazy Loading and Eager Loading.
Here is the code which does what I mentioned in a Lazy Loading fashion:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Author author1 = context.Authors.Where(a => a.Name == "Jane Doe").FirstOrDefault();
Console.WriteLine(author1.Name);
foreach (Publication pub in author1.Publications) {
    Console.WriteLine("\t" + pub.Title);
    foreach (PublicationResource pubRes in pub.PublicationResources) {
        Console.WriteLine("\t\t" + pubRes.ResourceName);
    }
}
The result after executing the application:
We may not want EF to work this way, since it does not perform well. If you run the
application with the debugger and trace the commands executed on an MS SQL Server instance
with SQL Server Profiler, you will see which queries are sent to the database engine.
1. Put a breakpoint on the first Console.WriteLine statement, immediately after loading the
author. Then look in the Profiler and you will see that the command executed only queries the Author
table.
2. Put a breakpoint on the next Console.WriteLine statement, inside the first foreach. Run the
application up to that breakpoint, and look into the Profiler to see that another query is
performed, this time looking for all the Publications for a specific AuthorId.
3. Then put a breakpoint on the last Console.WriteLine statement, and skip directly to it. Look in
the Profiler and see that another query is executed, which ultimately selects all
PublicationResources for a PublicationId which belongs to an AuthorId.
We can see that for every Author (in our case is only one) a query is made to get his Publications, and
then for every Publication, a query is done to get its PublicationResources. If we had 1000 Authors, each
with 100 Publications, and each Publication with 20 resources - you can imagine the time spent to get
that result we were looking for.
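To put a number on that scenario, here is a small back-of-the-envelope calculation. It assumes (my assumption, not something measured) that all Authors are loaded with a single query and that both navigation properties are then lazy loaded inside nested loops, as in the code above. The class and method names are made up for the illustration.

```csharp
using System;

public static class LazyLoadingCost
{
    // Number of SQL queries issued when every navigation property is lazy
    // loaded: one query for the authors, one per author for its publications,
    // and one per publication for its resources.
    public static long CountQueries(long authors, long publicationsPerAuthor)
    {
        long authorQueries = 1;
        long publicationQueries = authors;
        long resourceQueries = authors * publicationsPerAuthor;
        return authorQueries + publicationQueries + resourceQueries;
    }

    public static void Main()
    {
        // 1 + 1000 + 1000 * 100 = 101001 round trips to the database
        Console.WriteLine(CountQueries(1000, 100));
    }
}
```

Note that the number of resources per Publication (20 in the example) does not change the number of queries, only the number of rows each of the last 100,000 queries returns.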
We can achieve this same result, by eagerly loading all the references. Here is the code for this option:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Author author1 = context.Authors
    .Include("Publications")
    .Include("Publications.PublicationResources")
    .Where(a => a.Name == "Jane Doe")
    .FirstOrDefault();
Console.WriteLine(author1.Name);
foreach (Publication pub in author1.Publications) {
    Console.WriteLine("\t" + pub.Title);
    foreach (PublicationResource pubRes in pub.PublicationResources) {
        Console.WriteLine("\t\t" + pubRes.ResourceName);
    }
}
Please notice the Include() methods and their parameters. To get child entities in a cascading
manner (Author->Publication->PublicationResource), starting from Author, we have to include all
relationships. The first one is called “Publications” – notice the plural used here, as it’s a difference from
EF in .NET 3.5 SP1. The next reference is specified by its full name, “Publications.PublicationResources”.
Only one query is executed against the database, which performs a join between all three tables
involved: Author, Publication, PublicationResource. It is quite large, but I included it here:
Delayed query execution
The moment of query execution in the database is important, and can be specified by us depending on
how we write our EF code. Let’s take this example:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
List<Author> authors = context.Authors.ToList();
foreach (Author author in authors) {
    Console.WriteLine(author.Name + "\t" + author.Address);
}
The query is executed in the database engine immediately after calling ToList() method, and after that
we have the authors object loaded with data. Now let’s look at this piece of code:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
var authors = context.Authors;
foreach (Author a in authors) {
    Console.WriteLine(a.Name + "\t" + a.Address);
}
In this case, when the program execution reaches the foreach statement no query has been submitted to
the database engine yet. A query has only been prepared by EF and is held by the implicitly typed variable “authors”,
waiting to be used; the execution only takes place when the foreach loop starts enumerating “authors”. This
option of delaying query execution is very helpful when considering performance, and we will look
deeper into this aspect later on.
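The same delayed behavior can be observed without a database, using LINQ to Objects. The sketch below is only an analogy: the counter, the BuildQuery helper and the in-memory list of names are hypothetical, and where this sample evaluates a filter in memory, EF would instead send a SQL command to the server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the filter has actually been evaluated.
    public static int Evaluations = 0;

    // Builds a query; nothing is evaluated until the query is enumerated.
    public static IEnumerable<string> BuildQuery(IEnumerable<string> names)
    {
        return names.Where(n =>
        {
            Evaluations++;
            return n.StartsWith("J");
        });
    }

    public static void Main()
    {
        var names = new List<string> { "Jane Doe", "John Smith", "Ann Lee" };

        IEnumerable<string> query = BuildQuery(names);
        // The query exists, but the filter has not run yet.
        Console.WriteLine("After building the query: " + Evaluations);

        // ToList() enumerates the query, so the filter runs once per element.
        List<string> result = query.ToList();
        Console.WriteLine("After ToList(): " + Evaluations);
        Console.WriteLine("Matches: " + result.Count);
    }
}
```

Because the query object is only a description of the work to be done, it can be passed around and refined with further Where(...) calls before anything is executed.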
Inserting real life data
We will first add a new Publication into the database. We have to:
1. Create a new object of type Publication.
2. Set all of its mandatory properties. They are mapped to the non-nullable columns in the
database, so it’s easy to see which ones are mandatory.
3. Set the references, which are a specific case of mandatory properties. In our case, we must set an Author
for the new Publication.
4. Add the new object instance to the context, using one of the methods generated by
EF, specific to the object type. In our case, we call the AddToPublications() method.
5. Call the SaveChanges() method on the context, but only when we want to persist the changes in the
database. Until then, our object can be used from memory – the
context ObjectCache.
So, this is a sample code for adding a new Publication for a specific Author, “Jane Doe”:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Publication pub = new Publication();
pub.Author = context.Authors.Where(a => a.Name == "Jane Doe").FirstOrDefault();
pub.IsBookOrVideo = true;
pub.Title = "Developing web applications with Silverlight 3.0";
context.AddToPublications(pub);
context.SaveChanges();
// checking the new title in the database
foreach (Publication p in context.Publications) {
    Console.WriteLine(p.Title);
}
In this sample, when setting the Author of the new Publication we first make a query against the
database to get a specific Author. We will see further on that there is another way of setting the
Publication’s Author, without making an additional call to the database.
Here is the result after executing the code:
Another possibility to set the Publication’s Author is to set the AuthorId property, without calling the
database. In real-life situations, we may not want to make an extra call to the database if we already
know the unique identifier of the Author we want to set to our Publication.
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Publication pub = new Publication();
pub.AuthorId = 2;
pub.IsBookOrVideo = true;
pub.Title = "Developing web applications with Silverlight 4.0";
context.AddToPublications(pub);
context.SaveChanges();
// checking the new title in the database
foreach (Publication p in context.Publications) {
    Console.WriteLine(p.Title);
}
Notice that instead of setting the Author property of our new Publication, we set the AuthorId property.
Now, I know this is not an “Object Oriented” approach, it’s more like a data-oriented approach, but I
don’t see any problem in using this kind of “shortcut” if it helps performance. Of course, the
limitation of this approach is that we must already know the Id of the Author we want to set.
Here is the result:
Updating real life data
Let’s say we want to change the Title of a Publication from our database. Here is the simplest code we
can write in order to achieve the goal:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
// display all existing publications
foreach (Publication p in context.Publications) {
    Console.WriteLine(p.Id + "\t" + p.Title);
}
Console.WriteLine("Enter the Id and Title for updating the chosen Publication:");
Console.Write("Id: ");
int pubId = Convert.ToInt32(Console.ReadLine());
Console.Write("Title: ");
string pubNewTitle = Console.ReadLine();
Publication pub = context.Publications.Where(p => p.Id == pubId).FirstOrDefault();
pub.Title = pubNewTitle;
context.SaveChanges();
// checking the updated title in the database
foreach (Publication p in context.Publications) {
    Console.WriteLine(p.Id + "\t" + p.Title);
}
Please take a look at what we do in the code above:
1. We first list all Publications.
2. The application requests entering the Id of a Publication to be modified. Then, it requests the
new Title.
3. A query is made to select the Publication with the given Id.
4. We change the Title property on the object.
5. We save the changes to the database.
6. We list again the Publications to check that everything worked ok.
After running the code:
A problem I want to mention is related to step 3: in order to select the Publication we want to change,
we make a query to the database. Maybe there is no need to do that, especially since we have already
loaded all the Publications before, to list them, and I said before that the context manages the
ObjectCache – it actually remembers the objects which have been loaded before. Another way to
accomplish the same thing is to replace the FirstOrDefault() method with another one, which does not start with
a call to the database but with a lookup in the ObjectCache, and goes to the database only if the
searched object cannot be found there.
So, in the code above, we replace this line:
Publication pub = context.Publications.Where(p => p.Id == pubId).FirstOrDefault();
With this line:
Publication pub = (Publication)context.GetObjectByKey(new System.Data.EntityKey( "DataLayerEntityFrameworkEntities.Publications", "Id", pubId));
And here is the result:
GetObjectByKey() needs a parameter of type EntityKey, which represents the way EF uniquely identifies
objects. The ObjectContext can return an object based on its EntityKey, which is formed by: the entity
set name (fully qualified), the key name (the name of the property / column which uniquely identifies
the objects of the given type), and the value of the unique property / column.
So, GetObjectByKey() first searches the ObjectCache for an object with the given EntityKey. If it
finds it there, it returns it; if not, a call is made to the database. This approach performs better than
using FirstOrDefault(), but does not guarantee that the object it returns has the latest state: it is possible
that between the first load of the object and the GetObjectByKey() call, the values in the database have
been changed.
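The cache-first lookup and its staleness risk can be modeled in a few lines. What follows is a simplified, self-contained sketch and not EF's actual implementation: FakeStore and TrackingContext are hypothetical names standing in for the database and the ObjectContext, and a Publication is reduced to its Title.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the database; counts how often it is queried.
class FakeStore
{
    public readonly Dictionary<int, string> Titles = new Dictionary<int, string>();
    public int QueryCount;

    public string LoadTitle(int id)
    {
        QueryCount++;
        return Titles[id];
    }
}

// Hypothetical stand-in for the context; remembers what it has already loaded.
class TrackingContext
{
    private readonly FakeStore _store;
    private readonly Dictionary<int, string> _cache = new Dictionary<int, string>();

    public TrackingContext(FakeStore store) { _store = store; }

    // Looks in the cache first and goes to the store only on a miss.
    public string GetByKey(int id)
    {
        string title;
        if (_cache.TryGetValue(id, out title))
        {
            return title;
        }
        title = _store.LoadTitle(id);
        _cache[id] = title;
        return title;
    }
}

class IdentityMapDemo
{
    static void Main()
    {
        FakeStore store = new FakeStore();
        store.Titles[1] = "Original title";
        TrackingContext context = new TrackingContext(store);

        context.GetByKey(1);                     // cache miss: one store query
        store.Titles[1] = "Changed elsewhere";   // another user updates the row

        Console.WriteLine(context.GetByKey(1));  // cache hit: prints "Original title"
        Console.WriteLine(store.QueryCount);     // prints 1
    }
}
```

The second lookup saves a round trip but returns the value as it was at first load, which is exactly the trade-off described above.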
Adding computed properties to an Entity
Computed properties are properties of our entities which are computed at runtime and do not map to
a specific column in the database. For example, we would like to have a string property on
Author which contains both the name and the address; let’s call it Contact. There is no way to add
this property in the Entity Designer, since EF forces us to map all properties to columns from the
database.
A workaround is to have a new file for the partial class Author, where we add this new property:
partial class Author
{
    public string Contact
    {
        get { return Name + "\t" + Address; }
    }
}
Now we can write a piece of code which shows the value of this new Property:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Author a = context.Authors.FirstOrDefault();
Console.WriteLine(a.Contact);
Resolving concurrency issues
In a client-server scenario, where you have the ObjectContext running in a client application (console,
Windows, Silverlight, etc.) and the database on a shared database server, concurrency issues may appear.
Let’s illustrate this:
1. We have User1 who starts an application which lists all Authors.
2. We have User2 who does the same.
3. User1 changes the Name of the Author with Id=1. The changed value goes into the database.
a. Id = 1; Name = “Changed by User1”; Address = “New York”
4. User2 changes the Address of the same Author with Id=1. The updated Author is persisted to
the database, but because it is taken from User2’s memory, the old Name is persisted as well,
overwriting the changes of User1.
a. Id = 1; Name = “John Smith”; Address = “Changed by User2”
Here is the sample code for this scenario:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
Author author1 = context.Authors.Where(a => a.Id == 1).FirstOrDefault();
Console.WriteLine(author1.Name + "\t" + author1.Address);
Console.Write("New Name: ");
string name = Console.ReadLine();
Console.Write("New Address: ");
string address = Console.ReadLine();
author1.Name = name;
author1.Address = address;
context.SaveChanges();
We have to start 2 instances of the application. In instance 1, we enter a new Name for the Author,
keeping the same Address, and we save. In instance 2, we keep the Name and enter a new Address.
Then we look into the database and see that the changes from instance 1 have been overwritten by the
ones from instance 2.
We enter data in parallel:
We hit Enter in the instance above, so it finishes execution and saves a new Name in the database:
Then we hit Enter on the second instance, which finishes execution and saves its data:
Now we take a look into the database:
We notice that the Author with Id=1 has the values from the second instance of our application, so the
changes performed in instance 1 were overwritten.
Maybe this is the behavior we want, but sometimes we need to know when somebody tries to
update objects in the database while holding an old version of them. EF gives us a very simple way of
resolving this concurrency issue: in the Entity Designer, we can select the properties of an entity that
we want to watch for concurrency issues, and change their Concurrency Mode attribute to Fixed:
We apply this change for both Name and Address properties of Author. Then we run the two
applications again, and repeat the steps above to get this result:
We can see that on the second update we get an exception, which basically tells us that our data is out of
sync with the database. EF is able to detect this because of the way it builds the command which
updates the Author. Here is the command sent to the database server:
We can see that there is a WHERE clause which not only searches for the Id of the Author, but also for
all fields which have been set as Concurrency Mode = Fixed: Name and Address. Since this command
does not find an Author (the Name has already been changed by another application instance), no row
is updated, and EF throws an exception.
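The check can be modeled without a database. The sketch below is an illustration only: AuthorRow, AuthorTable and StaleDataException are hypothetical names, and Update mimics an UPDATE statement whose WHERE clause compares the Id together with the original Name and Address. It is not the SQL that EF generates.

```csharp
using System;

// Hypothetical in-memory row standing in for a record of the Author table.
class AuthorRow
{
    public int Id;
    public string Name;
    public string Address;
}

// Hypothetical exception standing in for OptimisticConcurrencyException.
class StaleDataException : Exception
{
    public StaleDataException(string message) : base(message) { }
}

class AuthorTable
{
    private readonly AuthorRow _row;

    public AuthorTable(AuthorRow row) { _row = row; }

    // Each client gets its own in-memory copy of the row.
    public AuthorRow Read()
    {
        return new AuthorRow { Id = _row.Id, Name = _row.Name, Address = _row.Address };
    }

    // Mimics: UPDATE ... SET Name, Address
    //         WHERE Id = original Id AND Name = original Name AND Address = original Address
    // Returns the number of rows affected (0 or 1).
    public int Update(AuthorRow original, string newName, string newAddress)
    {
        bool matches = _row.Id == original.Id
            && _row.Name == original.Name
            && _row.Address == original.Address;
        if (!matches)
        {
            return 0;
        }
        _row.Name = newName;
        _row.Address = newAddress;
        return 1;
    }

    // Mimics SaveChanges(): zero affected rows means the row changed underneath us.
    public void Save(AuthorRow original, string newName, string newAddress)
    {
        if (Update(original, newName, newAddress) == 0)
        {
            throw new StaleDataException("The row was changed by another user.");
        }
    }
}

class ConcurrencyDemo
{
    static void Main()
    {
        AuthorTable table = new AuthorTable(
            new AuthorRow { Id = 1, Name = "John Smith", Address = "New York" });

        AuthorRow user1Copy = table.Read();
        AuthorRow user2Copy = table.Read();

        table.Save(user1Copy, "Changed by User1", user1Copy.Address); // succeeds

        try
        {
            table.Save(user2Copy, user2Copy.Name, "Changed by User2");
        }
        catch (StaleDataException e)
        {
            Console.WriteLine(e.Message); // User2 worked on an outdated copy
        }
    }
}
```

Comparing against the original values is what turns a silent overwrite into a detectable failure: User2's update matches no row because the Name no longer equals the value User2 read.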
This exception can be caught, and inside the catch block we can choose what to do. For example:
try
{
    context.SaveChanges();
}
catch (OptimisticConcurrencyException)
{
    // our data is stale: reload the Author from the database
    context.Refresh(System.Data.Objects.RefreshMode.StoreWins, author1);
}
In this case, our in-memory Author is refreshed with the newest values from the database, which also
replaces the changes we made in memory. From now on, we may choose to do anything we’d like, for
example re-apply our changes and try the update again, which will work because our object is in sync.
Mapping stored procedures to the model
The sample application used for this section is found in the folder called “02_CommonScenarios_2”.
Very often some of our logic needs to stay at the database level, say inside some stored procedures.
There are many situations when we would prefer to use stored procedures instead of classic procedural
code (business logic code), and some examples would be: encrypting and decrypting data, working with
geospatial data types, or simply choosing to do all the CRUD (Create/Read/Update/Delete) operations at
the database level to leverage the power of a database engine in terms of performance.
We use the same database “DataLayerEntityFramework”, with 3 tables: Author, Publication and
PublicationResource. For a simple scenario, let’s say we want to create a Stored Procedure which gets all
the Publications of a specific Author, and we want to execute that Stored Procedure using our Entities
model.
Here are the steps to accomplish our goal:
1. We create the stored procedure. The database backup in the example folder already contains a
stored procedure called “[dbo].[Get_Publications_ForAuthor]”, which only contains this
statement and an input parameter of type int:
SELECT * from Publication where AuthorId = @AuthorId
2. In Visual Studio, inside the edmx designer, we right click and select Update Model From
Database.
3. In the Update Wizard, in the Add section we check the Stored Procedure and then click Finish
and save our edmx file:
4. Inside the edmx designer, we right click, and select Add -> Function Import.
5. We complete the name of the function, select the stored procedure from the list, and select the
return type which is a Collection of Publication.
6. To test the function import, write the following code inside the Main method:
DataLayerEntityFrameworkEntities context = new DataLayerEntityFrameworkEntities();
List<Publication> pubs = context.GetPublicationsForAuthor(1).ToList();
foreach (Publication pub in pubs)
{
    Console.WriteLine(pub.Title);
}
7. And here is the result:
We saw that using Stored Procedures from Entity Framework is quite simple. Actually, this was even
possible in the first version of EF which comes with .NET 3.5 SP1, but that one had some limitations. One
of the most important limitations was the inability of EF to map a function import on a Stored Procedure
which does not return a result composed of defined Entities. For example, if we wanted our stored
procedure above to get not only the Publications but also their PublicationResources, in the previous
version of EF it wouldn’t be possible.
So, EF from .NET 4.0 provides the ability to map the resultset coming from a stored procedure to a new
complex type, which is not necessarily an Entity in the model. Here are the steps to map such a stored
procedure to the model and use it:
1. We have a stored procedure called “GetPublicationsAndResourcesForAuthor” which looks like
this:
Select p.*, pr.ResourceName, pr.ResourceUrl from Publication p
inner join PublicationResource pr on p.Id = pr.PublicationId
where p.AuthorId = @AuthorId
2. We update our model to include the new stored procedure.
3. We right click on the edmx designer to add a Function Import, choose the new stored
procedure, click on the Get Column Information button to get the resultset schema, and then
click on Create New Complex Type to create a new type which will store the result:
4. The stored procedure is now mapped into our model, and will return a collection of the new
type PublicationWithResource. Here is the test code:
List<PublicationWithResource> pubsRes =
    context.GetPublicationsAndResourcesForAuthor(2).ToList();
foreach (PublicationWithResource pr in pubsRes)
{
    Console.WriteLine(pr.Title + "\t\t" + pr.ResourceName);
}
5. And here is a possible result:
POCO Entities
POCO stands for Plain Old CLR Object: an object with nothing special about it, just some properties defined on
it. EF 3.5 did not give us the possibility to create POCO entities. Our entities were restricted by some
conditions: any EF entity had to inherit from EntityObject or implement some EF-specific interfaces,
which made it hard to create domain classes independent of any persistence concerns. EF 4.0,
however, gives us the possibility to create types which do not need to inherit from EntityObject or
implement any of those interfaces.
When using POCO entities, we benefit from all the functions of Entity Framework, as if we were using
EntityObject-enabled entities, but everything is much simpler.
For this section, we will use the sample application and database in folder “02_CommonScenarios_3”.
The application has already defined the model and the POCO entities, which was done following these
steps:
1. In a new Console Application project, right click and select Add -> New Item then choose
ADO.NET Entity Data Model.
2. Walk through the wizard to create the model.
3. When Visual Studio opens the edmx designer, before the .Designer.cs file is
created with some generated code, change one property of the model, Code Generation Strategy = None:
4. Save the edmx file and notice that the .Designer.cs file is empty (actually it has an explanatory
comment).
At this point, we have the edmx file (model, mappings, etc.) but we do not have our entity classes, so we
can begin writing our own POCO entities. Here are the steps:
1. We add a new class to the project, named Author:
public class Author
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
    public List<Publication> Publications { get; set; }
}
2. We add a new class to the project, named Publication:
public class Publication
{
    public int Id { get; set; }
    public string Title { get; set; }
    public bool IsBookOrVideo { get; set; }
    public int AuthorId { get; set; }
    public Author Author { get; set; }
}
3. We create our ObjectContext class, in order to be able to manage the entities:
class DataLayerContext : ObjectContext
{
    public DataLayerContext()
        : base("name=DataLayerEntityFrameworkEntities", "DataLayerEntityFrameworkEntities")
    {
        _authors = CreateObjectSet<Author>();
        _publications = CreateObjectSet<Publication>();
    }
    public ObjectSet<Author> Authors
    {
        get { return _authors; }
    }
    private ObjectSet<Author> _authors;
    public ObjectSet<Publication> Publications
    {
        get { return _publications; }
    }
    private ObjectSet<Publication> _publications;
}
4. Now we have: the model, the POCO entities written as we wanted, and the ObjectContext class.
We can test that everything is working as it should by inserting this code in the Main method:
DataLayerContext context = new DataLayerContext();
Publication pub = context.Publications.FirstOrDefault();
Console.WriteLine(pub.Title);
5. And here is the result:
Mapping the POCO entities
You may be wondering how EF does the mapping and correlation between our POCO entities and the
metadata (edmx). In EF 3.5, this was achieved by attributes, but now it’s based on convention: the entity
type names, property names, and complex type names used in our POCO classes must match those defined
by the conceptual model.
The demo application “02_CommonScenarios_3” includes some examples of simple operations (CRUD).
I won’t go through all of them here, but I will point out some differences from EF 3.5 and discuss the more
important aspects.
Fixing up relationships
For example, let’s say you want to add a new Publication to the database. One way to do that, as we
have already seen, is to create a new object of type Publication and set all its properties with desired
values, including the references. Another way to set the Publication’s Author, besides its AuthorId
property, is to simply add it to the collection of an existing Author:
Author auth1 = context.Authors.Include("Publications").FirstOrDefault();
auth1.Publications.Add(new Publication()
{
    Title = "New Pub Test",
    IsBookOrVideo = true
});
context.SaveChanges();
The interesting aspect in the previous example is that EF automatically sets the reference between the
Author and the Publication.
Other POCO benefits
There are many benefits when creating our POCO entities and not using the old-style EF entities. I will
list some of them, but I am sure there are more:
1. Our POCO entities are simpler to read, because they do not carry extra stuff like inheritance
from EntityObject, attributes needed for mapping, etc.
2. Now it’s much simpler to add computed properties, because our classes are not
designer-generated but written by hand.
3. In EF 3.5 I had this problem: I wanted to decorate some properties (which were mapped to
columns) with attributes, say for business rules validation. This was not supported, and I could
do it only through workarounds (e.g. create a copy of the property, and use it from the
business logic perspective). Now we don’t have to worry about anything: we just put our attributes
where we want them to be.
So, I changed the Author class above to the following, to include a computed property and a
custom attribute.
public class Author
{
    public int Id { get; set; }
    [PersonNameValid]
    public string Name { get; set; }
    public string Address { get; set; }
    public List<Publication> Publications { get; set; }
    public string Contact
    {
        get { return Name + "\t" + Address; }
    }
}
Please notice the computed property Contact, which is not mapped to any column in the database
and does not have a setter, and the “PersonNameValid” attribute used to validate user input. The
attribute and its logic are found in the “02_CommonScenarios_3” solution; I don’t include that code
here because of space constraints.
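For readers who do not have the solution at hand, here is a hypothetical sketch of what such an attribute and validator might look like. This is not the code from the sample solution: the validation rule (letters, spaces, hyphens, apostrophes and periods only) and the reflection-based Validate method are assumptions made for illustration, and the Author class is trimmed to the relevant properties.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute: marks a string property that must look like a person's name.
[AttributeUsage(AttributeTargets.Property)]
class PersonNameValidAttribute : Attribute
{
    public bool IsValid(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return false;
        }
        foreach (char c in value)
        {
            // Assumed rule: letters, spaces, hyphens, apostrophes and periods only.
            if (!char.IsLetter(c) && c != ' ' && c != '-' && c != '\'' && c != '.')
            {
                return false;
            }
        }
        return true;
    }
}

class Author
{
    public int Id { get; set; }
    [PersonNameValid]
    public string Name { get; set; }
    public string Address { get; set; }
}

static class BusinessValidation
{
    // Checks every public property decorated with [PersonNameValid].
    public static void Validate(object entity)
    {
        foreach (PropertyInfo property in entity.GetType().GetProperties())
        {
            object[] attributes =
                property.GetCustomAttributes(typeof(PersonNameValidAttribute), true);
            if (attributes.Length == 0)
            {
                continue;
            }
            PersonNameValidAttribute rule = (PersonNameValidAttribute)attributes[0];
            string value = property.GetValue(entity, null) as string;
            if (!rule.IsValid(value))
            {
                throw new ArgumentException(
                    "Invalid value for " + property.Name + ": " + value);
            }
        }
    }
}

class ValidationDemo
{
    static void Main()
    {
        Author author = new Author { Id = 1, Name = "John Smith", Address = "New York" };
        BusinessValidation.Validate(author); // passes silently

        author.Name = "Test/Not valid";
        try
        {
            BusinessValidation.Validate(author);
        }
        catch (ArgumentException e)
        {
            Console.WriteLine(e.Message); // the '/' character is rejected
        }
    }
}
```

Because the attribute sits directly on the POCO property, the business rule travels with the class and no copy of the property is needed.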
Here is the test code for the custom property and the attribute:
DataLayerContext context = new DataLayerContext();
Author auth1 = context.Authors.Include("Publications").FirstOrDefault();
Console.WriteLine(auth1.Contact);
auth1.Name = "Test/Not valid";
BusinessValidation.Validate(auth1);
context.SaveChanges();
The result shows us the computed property value, followed by an exception thrown because the name
entered is not valid:
Lazy loading with POCO entities
You already know that lazy loading is possible in Entity Framework, even in its first version. This comes
out of the box when creating our entities with the default code generation technique, but when we are
using POCO entities we must explicitly tell EF how to handle lazy loading.
There are two things we must do to have lazy loaded references on our POCO entities:
1. Declare the property which represents the reference as virtual.
2. Enable deferred loading on the hand-crafted context that we have.
The following code tries to print the Title of the first Publication of the first Author:
DataLayerContext context = new DataLayerContext();
Author auth1 = context.Authors.FirstOrDefault();
Console.WriteLine(auth1.Publications.FirstOrDefault().Title);
Here is the result:
To enable lazy loading on our entity, we change the Publications property in Author:
public virtual List<Publication> Publications { get; set; }
Then we enable lazy loading on the context:
context.ContextOptions.LazyLoadingEnabled = true;
And here is the result:
So what does EF do under the covers? When it sees a virtual property, EF provides a dynamic proxy
instance of our POCO entity at runtime. In plain English: EF creates a dynamic class derived from the POCO
entity, with some extra behavior which permits it to perform lazy loading. Here is the auth1
variable at runtime (notice its type):
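Conceptually, the proxy behaves like a hand-written subclass that overrides the virtual property. The sketch below is only an analogy under that assumption: AuthorLazyProxy and its loader delegate are hypothetical names, while the real proxy type is generated by EF at runtime and loads the data through the context.

```csharp
using System;
using System.Collections.Generic;

class Publication
{
    public string Title { get; set; }
}

class Author
{
    public int Id { get; set; }
    // virtual is what allows a derived class to intercept the getter
    public virtual List<Publication> Publications { get; set; }
}

// Hypothetical hand-written equivalent of a lazy-loading proxy.
class AuthorLazyProxy : Author
{
    private readonly Func<int, List<Publication>> _loader;
    private bool _loaded;

    public AuthorLazyProxy(Func<int, List<Publication>> loader)
    {
        _loader = loader;
    }

    public override List<Publication> Publications
    {
        get
        {
            if (!_loaded)
            {
                // First access: fetch the related objects, then remember them.
                base.Publications = _loader(Id);
                _loaded = true;
            }
            return base.Publications;
        }
        set
        {
            base.Publications = value;
            _loaded = true;
        }
    }
}

class ProxyDemo
{
    static void Main()
    {
        int loads = 0;
        Author author = new AuthorLazyProxy(id =>
        {
            loads++;
            return new List<Publication> { new Publication { Title = "Loaded on demand" } };
        });
        author.Id = 1;

        Console.WriteLine(loads);                         // prints 0: nothing loaded yet
        Console.WriteLine(author.Publications[0].Title);  // first access triggers the load
        Console.WriteLine(author.Publications.Count);     // served from memory
        Console.WriteLine(loads);                         // prints 1
    }
}
```

This also shows why the property has to be virtual: without it, a derived class has no way to hook into the getter, and the collection simply stays null.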
Explicit loading
Though lazy loading is quite good in terms of performance tuning, you might want to load the navigation
properties at specific moments in time, and to be in complete control over the operation.
This is possible, because now with EF 4.0 you can explicitly load relationships on POCO entities when you
like.
Here is the sample code for explicitly loading a reference / navigation property:
Publication pub = context.Publications.Where(p => p.Id == 8).FirstOrDefault();
context.LoadProperty(pub, p => p.PublicationResources);
foreach (PublicationResource pubRes in pub.PublicationResources)
{
    Console.WriteLine(pubRes.ResourceName);
}
Generating the code for POCO entities
The POCO entities do not always have to be written from scratch. We can
use code generation templates, enabled by the tool called Text Template Transformation Toolkit
(T4 for short).
Many to many entities
The demos for this section use the database called “Conferences” which has the following structure:
We have Conferences and Users, and any User can participate in any Conference. This is a Many to Many
relationship between User and Conference. When we create an object model for this database
using Entity Framework, we discover that EF is pretty smart about handling this type of relationship: it
only creates 2 entities, Conference and User, like this:
Of course, there is no need for an extra entity corresponding to the UserAtConference table as
long as UserAtConference does not have any fields other than the relationship fields IdConference and
IdUser. So EF eliminates an unnecessary entity, but if UserAtConference had extra fields,
EF would create an entity corresponding to this table.
If we want to add an existing User to an existing Conference, here is how to do it:
ConferencesEntities context = new ConferencesEntities();
Console.WriteLine("Add a User to a Conference");
Console.WriteLine("Username: ");
string username = Console.ReadLine();
Console.WriteLine("Conference: ");
string conferenceTitle = Console.ReadLine();
User userToAdd = context.Users.Where(u => u.Username == username).FirstOrDefault();
Conference confDest = context.Conferences.Where(c => c.Title == conferenceTitle)
    .FirstOrDefault();
confDest.Users.Load();
confDest.Users.Add(userToAdd);
context.SaveChanges();
If we want to add a new User to an existing Conference (which means saving the User to the database
and adding it to a Conference):
ConferencesEntities context = new ConferencesEntities();
User userNew = new User()
{
    Address = "Bucharest, Romania",
    Username = "mircea"
};
Conference conf1 = context.Conferences.FirstOrDefault();
conf1.Users.Add(userNew);
context.SaveChanges();
Inheritance
In many real-life situations, we have hierarchies of entities which
need to be stored in the database behind our application. Most of the time we would like to represent
such entities in a hierarchical manner in the application (we would have classes like User (abstract), and
UserAdmin and UserPublic which inherit from User), but in the database there are several ways to
store this data.
Table per hierarchy (TPH)
In this model, we store all the entities in one table, having a column or a set of columns as differentiator
between concrete types. For the example above, we would have one table User with a column
IsAdmin(bit) as differentiator. Of course, there are columns which are specific to a concrete type, and
these should be marked as nullable since the other types cannot have values for those columns.
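To make the idea concrete, here is a small self-contained sketch of how rows of a single User table could be turned into concrete types based on the IsAdmin column. UserRow and TphMapper are hypothetical names (with EF, the mapping layer does this work), the sample rows are made up, and Team and Address are assumed to be the type-specific columns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

abstract class User
{
    public int Id { get; set; }
    public string Username { get; set; }
}

class UserAdmin : User
{
    public string Team { get; set; }
}

class UserPublic : User
{
    public string Address { get; set; }
}

// Hypothetical shape of one row in the single TPH table.
class UserRow
{
    public int Id;
    public string Username;
    public bool IsAdmin;    // the differentiator column
    public string Team;     // null for public users
    public string Address;  // null for admins
}

static class TphMapper
{
    // Picks the concrete type from the differentiator and copies the relevant columns.
    public static User Materialize(UserRow row)
    {
        if (row.IsAdmin)
        {
            return new UserAdmin { Id = row.Id, Username = row.Username, Team = row.Team };
        }
        return new UserPublic { Id = row.Id, Username = row.Username, Address = row.Address };
    }
}

class TphDemo
{
    static void Main()
    {
        List<UserRow> table = new List<UserRow>
        {
            new UserRow { Id = 1, Username = "ana", IsAdmin = true, Team = "Support" },
            new UserRow { Id = 2, Username = "dan", IsAdmin = false, Address = "Cluj" }
        };

        List<User> users = table.Select(row => TphMapper.Materialize(row)).ToList();
        foreach (UserAdmin admin in users.OfType<UserAdmin>())
        {
            Console.WriteLine(admin.Username + "\t" + admin.Team);
        }
    }
}
```

Note how each row leaves one of the type-specific columns empty, which is the reason those columns have to be nullable.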
Table per type (TPT)
Here the properties of the base type are stored in a table which is shared between the other “concrete”
tables, which only store specific data. For our example above, we have 3 tables: User, with all common
data; UserAdmin, with the properties specific to the Admins; and UserPublic, with the properties
specific to the Public type. The latter tables each have a foreign key relationship to the shared table, in
order to keep the id unique across types.
Table per concrete class (TPC)
In this model, there is one table for each concrete type, with a column for each property in that type.
For our situation: table Admin (Id, Username, Team) and table Public (Id, Username, Address). Of
course, we must have a way to determine the uniqueness of the Ids across these 2 tables – maybe we
don’t want to have a Public user with Id = 20, and an Admin user with the same Id = 20.
Choosing between strategies
In short, I would not recommend TPC, since it does not seem very intuitive, at least to me. Another
limitation of TPC is the lack of support in the Entity Framework designer.
For TPH, the best argument is performance: we do not need to join 2 tables when we query the
database for data of a specific type. Personally, I use this strategy most of the time, but there are some
drawbacks which must be taken into consideration.
So, the TPT advantages over TPH are:
• Flexibility: it is easy to add a new type to the model, since you only have to add a new table.
• Data validation: the need to have nullable columns in TPH does not allow for validation at the
database level. In TPT, we simply do not store data from different types in the same table.
For a more detailed comparison between strategies, please read this blog post by Alex James,
a Program Manager working on the ADO.NET team at Microsoft:
http://blogs.msdn.com/alexj/archive/2009/04/15/tip-12-choosing-an-inheritance-strategy.aspx
Samples
The samples for this section are found in the 02_CommonScenarios_Inheritance folder. Here is the
database diagram used in this section:
The database contains 4 tables: one table called User for the TPH model and 3 more for the TPT model.
TPH sample
The solution found in the 02_CommonScenarios_Inheritance folder contains several samples. Let’s
first discuss the TPH model of storage, by describing the steps necessary to map our entities to the
User table:
1. Create the User table in the database. It contains all the columns, for all types of concrete users.
2. In Visual Studio, add a new Entity Data Model, and select the User table from the database.
3. The EF model in Visual Studio now contains an Entity called User, which is a 1:1 representation
of the User table. This is not what we need: we must have 3 entities in our application
model, namely an abstract class User and 2 concrete classes, UserPublic and UserAdmin. In
order to achieve this, we must manually create 2 entities in the model.
4. Add a new Entity in the model, and name it UserPublic and set it to inherit from User. For this, in
the edmx designer perform right click -> Add -> Entity:
5. Do the same for UserAdmin. Now the model should look like this:
6. If you Build the solution now, you will notice some errors. They are due to the fact that our
entities are not mapped properly. We still need to do some things in order to have everything
set.
7. Select the entity UserPublic and in the Mapping Details window, set its mapping to the User
table, and add the condition IsAdmin = false:
8. Do the same for UserAdmin: map it to User table and add the condition IsAdmin = true.
9. Now, our User class contains the IsAdmin property which is no longer needed, since the derived
classes use its value as a type differentiator. So, remove the IsAdmin property and change the
User class to Abstract: true in the Properties window:
10. Build the solution and you should see no errors.
11. Now, if we want to see the Public Users or the Admins, we need to write some code like the
following, using the OfType<>() method:
DataLayerEFInheritanceEntities context = new DataLayerEFInheritanceEntities();
List<UserAdmin> admins = context.Users.OfType<UserAdmin>().ToList();
foreach (UserAdmin admin in admins)
{
    Console.WriteLine(admin.Username + "\t" + admin.Team);
}
12. Run the application, and you should see the users of type Admin:
TPT sample
This sample is also included in the 02_CommonScenarios_Inheritance folder and uses the same
database as the previous one. Here are the steps needed to create our model in a TPT scenario:
1. Add an Entity Data Model to the application, and select the 3 tables called User2, User2Admin,
User2Public. The edmx designer should show something like this:
2. Notice that EF mapped association relationships between our entities, but we need inheritance.
So, we will remove existing relationships and create new ones. For table User2Admin, delete the
association, then right click the entity -> Add -> Inheritance, select User2 as base type and
User2Admin as derived type:
3. Do the same for User2Public, and then the model should look like this:
4. If you build now, you will notice some mapping errors, because the derived classes still have the
UserId properties. They are not needed in the derived entities, since these
inherit the Id property from the base type (remember: our Ids are unique among the
tables).
5. For both derived entities, we have to delete the UserId property, and then map the UserId
column from the underlying table to the inherited Id property:
6. The last step we must do is mark the User2 class as abstract. Now if we build, there are no
errors and our model is ready to be used. Here is a sample code to get the entities using our
model:
DataLayerEFInheritanceEntities1 context = new DataLayerEFInheritanceEntities1();
List<User2Admin> admins = context.User2.OfType<User2Admin>().ToList();
foreach (User2Admin admin in admins)
{
    Console.WriteLine(admin.Username + "\t" + admin.Team);
}
7. And an expected result:
TPT with POCO entities sample
Since EF 4.0 introduced the possibility of creating our own POCO entities, I think this would make an
interesting sample. So, the scenario is Table per Type, but we write our own entities (see the POCO
section for more details).
Before we start, I must warn you to create the edmx (EF model) in a separate application (found in the
02_CommonScenarios_Inheritance_Poco folder), to make sure that it is the only edmx in the
assembly. When I first tried to make the demo for this section I encountered the following error,
because my TPT POCO edmx was created in the same assembly as the other 2 models used for the
previous examples: “Mapping and metadata information could not be found for EntityType”.
So, in order to have your TPT strategy implemented using POCO entities, you have to follow these steps:
1. Create the model mapped to the 3 tables: User2, User2Admin, User2Public, but do not generate
the code (remember the POCO entities from a previous section!):
2. Perform all the needed steps to have the TPT entities model (as in the previous sample).
3. Create all the POCO entities by hand. You will have 3 classes: User2, User2Admin, User2Public –
with the corresponding properties.
4. Create your own ObjectContext.
5. Write a sample code to test everything:
InheritancePocoContext context = new InheritancePocoContext();
foreach (User2Admin admin in context.Users.OfType<User2Admin>())
{
    Console.WriteLine(admin.Username + "\t" + admin.Team);
}
(To come …)
4. Advanced topics in Entity Framework
4.1. Transactions
4.2. Audit and logging with Entity Framework
4.3. Improving performance
4.4. Encrypting data in the database
4.5. Using Entity Framework in different architectures