Lesson 3 - Approaches to working with relational databases in .NET
In the last lesson, Creating a Local Database in Visual Studio, we prepared a database of words.
Today's tutorial will be theoretical, we'll introduce problems of the relational and object-oriented world, as well as approaches we can use to work with databases in C# .NET.
Contradiction of object-oriented and relational approaches
The object-oriented world and the world of relational databases (the relational world) are very different. These are 2 different philosophies that I dare say here are incompatible.
Relational databases are a proven way to work with data. Although there are also fully object-oriented databases, it isn't worthwhile for companies to invest money in them and therefore they haven't been adopted yet. After the revolution in programming and the arrival of objects, of course, there was a problem with storing data, because relational databases cannot store objects. There are several ways to deal with this.
1. Non-object-oriented programming
The first option is, of course, to program completely without objects. However, this way we'd go out on a limb, we wouldn't be able to use any 3rd party components and our code would be of a very poor quality. Since C# is an object-oriented language, it wouldn't work well enough.
2. Database wrapper
The wrapper approach allows us to work with a database as if we've worked with an object, however, we still communicate with it in its SQL language. So we mix object-oriented and relational code. The approach is a kind of compromise and requires the OOP philosophy to customize a little. The advantage is maintaining the database performance and capabilities at the cost of slight degradation of OOP ideas.
We most often see data from a database as values in a table (which is an object) and we lose the opportunity to assign some functionality to entities. Instead, we group it into managers. It's also possible to partially map data to existing classes, however, we won't achieve a full-fledged concept of the object model.
3. Object-relational mapping
Object-relational mapping (ORM) follows the idea of OOP orthodoxly. So instead of an array of values, we get objects directly from a database, and they have methods on them. We don't communicate at all in the SQL language, we see tables in a database as a collection of objects with which we can work using common assets of a language. In fact, we're completely shielded from working with a relational database. Sounds great, doesn't it?
The catch, of course, is that there's a great degradation in database performance in the background, SQL queries are generated automatically and are often inefficient. Another problem with ORM is that it's very complex (not to use, but to program). Fortunately, C# comes with a perfected ORM built-in, so we don't have to deal with anything. On the contrary, for example, running ORM in PHP isn't easy and I prefer the wrapper approach there.
Opinions on ORM are very controversial, e.g. that its very idea is incorrect, because generated SQL code simply cannot be effective and it's necessary to think about its final form, so the abstraction isn't complete. Personally, I have a neutral attitude towards ORM, and if someone provides me with a standardized ORM along with a language, I'll be happy to use it. If not, I can do without it.
4. Object databases
In addition to relational databases, there are also the already mentioned object databases. These solve the problem of incompatibility of object-oriented and relational approach. They provide the same convenience as ORM, but internally there's no need to convert data into tables, they're stored directly as objects. Theoretically, there's no performance issue or other reason why they shouldn't replace relational databases. Unfortunately, they're barely used in practice and we can only hope that this will change over time. Those interested can look, for example, at the MongoDB project.
Database connection options
In .NET, we have several options for using a database in our application. I won't describe them in detail, let's just say something briefly about each option. It's important that we know they exist. Databases are a very extensive topic, so if you need a given option, you can find out more about it.
Connected application
We use the connected application approach when we often need to read or
change data in real time. Using the DataReader
,
Command
and Connection
classes, we send SQL commands
directly to the database and then get results.
I illustrated the situation in the image:
Disconnected application - DataSet
The disconnected application approach works in such a way that we have a
DataSet
in the operating memory, which contains data from a
database. The application works with the DataSet
and occasionally
the DataSet
synchronizes with a production database on a server (a
disk). For the price of less up-to-date data, we get increased speed and more
comfortable work. This approach is summarized by the ADO.NET component that is
part of the .NET Framework.
DataSet
contains tables, table rows, and column rows. A table is
an object, we can add rows to it and edit them without writing SQL code. If we
want to run a command on the database, we use the DataAdapter
to
fill our DataSet
with data. We must already write the commands in
the SQL language of the database.
So we have a certain object abstraction, we work with tables as with objects, but data is still just columns in a table, not instances of some objects. We also still use SQL. In terms of the above approaches, it is the wrapper one.
LINQ To SQL
LINQ to SQL provides a complete object abstraction over a database
(object-relational mapping). We work with a database as if it was, for example,
a List
of objects and we don't have to worry about SQL queries at
all, we don't know about tables or columns, everything happens in the background
automatically, a query returns full fledged objects to us. The price for such a
luxury is worse query optimization, which usually doesn't matter.
Our application includes DataClasses
, which are object
structures of a database. They contain classes for individual tables, table
columns are properties of given classes. In the application, we communicate only
with DataClasses
, which then communicate with a database in the
background using LINQ to SQL and performs SQL queries for us. We don't come into
contact with the SQL language at all and we work with the database as an object
structure in the memory.
Next time, we'll describe this technology thoroughly and create our first database application.
Entity Framework
Entity framework goes even further than LINQ to SQL. It's a competitive and more advanced technology than LINQ to SQL. In addition, it supports many-to-many relationships. This approach is best used, and if you aren't limited by your application's architecture or performance, it's a good choice for most applications.
In our course, we'll gradually introduce all the approaches, each suits something different, and you should know how to use them in practice.
In the next lesson, Connected Database Applications in C# .NET, we'll introduce the connected application approach.