A database is a collection of information that is organized so that it can be easily accessed, managed and updated. Computer databases typically contain aggregations of data records or files, such as records of sales transactions or interactions with specific customers.
In a relational database, information is organized into tables made up of rows and columns, and the tables are indexed to make it easier to find relevant information through SQL queries. In contrast, a graph database uses nodes and edges to define relationships between data entries, and queries are typically written in a graph-specific or semantic query language. SPARQL, a semantic query language for RDF data, is published as a World Wide Web Consortium (W3C) Recommendation.
Typically, the database management system (DBMS) provides users with the ability to control read/write access, specify report generation and analyze usage. Some databases offer ACID (atomicity, consistency, isolation and durability) compliance to guarantee that data stays consistent and that each transaction either completes in full or has no effect at all.
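The atomicity part of ACID can be illustrated with a short sketch. It uses Python's built-in sqlite3 module and an in-memory database; the accounts table, the account names and the transfer function are hypothetical and exist only for this example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single all-or-nothing transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            # Credit first, so a later failure leaves partial work to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # The CHECK constraint rejects a negative balance here.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # overdraws alice, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the second call, the credit that was already applied to the destination account is undone along with the failed debit, so the balances reflect only the first transfer.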
Types of Databases
Databases have evolved since their inception in the 1960s, beginning with hierarchical and network databases, followed by relational databases in the 1970s and object-oriented databases in the 1980s, and continuing today with SQL, NoSQL and cloud databases.
In one view, databases can be classified according to content type: bibliographic, full text, numeric and images. In computing, databases are sometimes classified according to their organizational approach. There are many different kinds of databases, ranging from the most prevalent approach, the relational database, to distributed, cloud, graph and NoSQL databases.
Relational databases are made up of a set of tables with data that fits into predefined categories. Each table has at least one data category in a column, and each row holds one data instance for the categories defined by the columns.
The Structured Query Language (SQL) is the standard user and application program interface for a relational database. Relational databases are easy to extend, and a new data category can be added after the original database creation without requiring that you modify all the existing applications.
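As a minimal sketch of the points above, the following uses Python's built-in sqlite3 module to create a table, index it, query it with SQL, and then add a new column after the fact. The customers table, its columns and the names_in_city helper are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each column is a data category; each row is one instance of those categories.
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
)
# An index on city speeds up lookups that filter on that column.
conn.execute("CREATE INDEX idx_customers_city ON customers (city)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ada", "London"), ("Grace", "New York"), ("Alan", "London")],
)

def names_in_city(conn, city):
    """An 'existing application' query that only knows about name and city."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]

before = names_in_city(conn, "London")

# Add a new data category after the table was created.
conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")
conn.execute(
    "UPDATE customers SET email = ? WHERE name = ?", ("ada@example.com", "Ada")
)

# The older query names its columns explicitly, so it keeps working unchanged.
after = names_in_city(conn, "London")
columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]
```

The query written before the schema change returns the same result afterward because it selects columns by name. Applications that rely on column position or on SELECT * may still need attention when a column is added.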
Distributed databases are databases in which portions of the data are stored in multiple physical locations, and in which processing is dispersed or replicated among different points in a network.
Distributed databases can be homogeneous or heterogeneous. All the physical locations in a homogeneous distributed database system have the same underlying hardware and run the same operating systems and database applications. The hardware, operating systems or database applications in a heterogeneous distributed database may be different at each of the locations.
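The homogeneous/heterogeneous distinction can be expressed as a small check. This is only a sketch of the definition given above: the Site record, its fields and the classify function are hypothetical, and the sample values are placeholders rather than a description of any real deployment.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Site:
    """One physical location of a distributed database (illustrative fields)."""
    location: str
    hardware: str
    operating_system: str
    dbms: str

def classify(sites):
    """Return 'homogeneous' if every site runs the same stack, else 'heterogeneous'."""
    if not sites:
        raise ValueError("at least one site is required")
    # Collect the distinct (hardware, OS, DBMS) combinations across all sites.
    stacks = {(s.hardware, s.operating_system, s.dbms) for s in sites}
    return "homogeneous" if len(stacks) == 1 else "heterogeneous"

same_stack = [
    Site("site-a", "x86-64", "Linux", "dbms-1"),
    Site("site-b", "x86-64", "Linux", "dbms-1"),
]
mixed_stack = same_stack + [Site("site-c", "arm64", "Linux", "dbms-2")]

same_result = classify(same_stack)
mixed_result = classify(mixed_stack)
```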
A cloud database is a database that has been optimized or built for a virtualized environment, whether in a hybrid cloud, public cloud or private cloud. Cloud databases provide benefits such as the ability to pay for storage capacity and bandwidth on a per-use basis, and they provide scalability on demand, along with high availability.
A cloud database also gives enterprises the opportunity to support business applications in a software-as-a-service deployment.
NoSQL databases are useful for large sets of distributed data.
NoSQL databases are effective for big data performance issues that relational databases aren’t built to solve. They are most effective when an organization must analyze large volumes of unstructured data or data that’s stored across multiple virtual servers in the cloud.
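To make the idea concrete, here is a toy, in-memory sketch of a schema-less document store whose records are spread across several partitions by hashing the key. It is not the API of any real NoSQL product; TinyDocumentStore and all of its methods are hypothetical, and the partitions stand in for separate servers.

```python
import zlib

class TinyDocumentStore:
    """Toy key/document store; each partition stands in for one server."""

    def __init__(self, partitions=3):
        self.partitions = [{} for _ in range(partitions)]

    def _partition_for(self, key):
        # A stable hash of the key decides which partition holds the document.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def put(self, key, document):
        self._partition_for(key)[key] = document

    def get(self, key):
        return self._partition_for(key).get(key)

    def find(self, field, value):
        """Scan every partition; documents that lack the field are skipped."""
        return sorted(
            key
            for part in self.partitions
            for key, doc in part.items()
            if doc.get(field) == value
        )

store = TinyDocumentStore()
# Documents do not share a fixed schema: each one carries its own fields.
store.put("u1", {"kind": "user", "name": "Ada"})
store.put("u2", {"kind": "user", "name": "Grace", "tags": ["compilers"]})
store.put("e1", {"kind": "event", "page": "/home", "duration_ms": 1200})

users = store.find("kind", "user")
total = sum(len(part) for part in store.partitions)
```

Lookups by key touch a single partition, while a search on an arbitrary field has to visit all of them. That trade-off is one reason such systems tend to be designed around the access patterns of the application.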
Items created using object-oriented programming languages are often stored in relational databases, but object-oriented databases are well-suited for those items.
An object-oriented database is organized around objects rather than actions, and data rather than logic. For example, in an object-oriented database a multimedia record can be a definable data object, as opposed to an alphanumeric value.
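The contrast can be sketched in a few lines. The example below is purely illustrative: VideoClip, to_relational_row and TinyObjectStore are hypothetical names, and the in-memory store only imitates the idea of keeping an object whole rather than the behavior of a real object-oriented database.

```python
from dataclasses import dataclass, field

@dataclass
class VideoClip:
    """A multimedia record modeled as an object with data and behavior."""
    title: str
    frames: list = field(default_factory=list)
    fps: int = 24

    def duration_seconds(self):
        return len(self.frames) / self.fps

def to_relational_row(clip):
    # Flattening for a relational table: the nested frame data is reduced
    # to a count, and the object's methods are not part of the row.
    return (clip.title, clip.fps, len(clip.frames))

class TinyObjectStore:
    """Toy store that keeps whole objects under generated identifiers."""

    def __init__(self):
        self._objects = {}
        self._next_id = 1

    def save(self, obj):
        object_id = self._next_id
        self._objects[object_id] = obj
        self._next_id += 1
        return object_id

    def load(self, object_id):
        return self._objects[object_id]

clip = VideoClip("intro", frames=[b"frame"] * 48)
row = to_relational_row(clip)

store = TinyObjectStore()
clip_id = store.save(clip)
loaded = store.load(clip_id)
duration = loaded.duration_seconds()
```

The flattened row holds only alphanumeric values, whereas the stored object comes back with its nested data and methods intact.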
A graph-oriented database, or graph database, is a type of NoSQL database that uses graph theory to store, map and query relationships. Graph databases are essentially collections of nodes and edges, where each node represents an entity, and each edge represents a connection between nodes.
Graph databases that store RDF data often employ SPARQL, a declarative query language and protocol for graph database analytics. SPARQL can express many of the analytics that SQL can perform, and it can also be used for semantic analysis, the examination of relationships. This makes it useful for performing analytics on data sets that have both structured and unstructured data. SPARQL allows users to perform analytics on information stored in a relational database, provided it is mapped to a graph model, and graph databases commonly support analyses such as friend-of-a-friend (FOAF) relationships, PageRank and shortest path.
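The node-and-edge model and two of the analyses named above, friend-of-a-friend and shortest path, can be sketched in plain Python. This is not SPARQL and not the interface of any graph database; the sample edges and the function names are hypothetical, and the shortest path is found with an ordinary breadth-first search.

```python
from collections import deque

# Each pair is an undirected edge between two person nodes (sample data).
edges = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("alice", "erin")]

def build_adjacency(edges):
    """Map every node to the set of nodes it is directly connected to."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def friends_of_friends(adjacency, person):
    """Nodes exactly two hops away: friends of friends who are not friends."""
    direct = adjacency.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= adjacency.get(friend, set())
    return two_hops - direct - {person}

def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns a list of nodes or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

adjacency = build_adjacency(edges)
foaf = friends_of_friends(adjacency, "alice")
path = shortest_path(adjacency, "alice", "dave")
```

A graph database performs the same kind of traversal over persisted nodes and edges, with the query expressed declaratively instead of as hand-written loops.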