👩‍💻 📚 Data Transfer Overview

Data-driven engineering is the process of using data to make decisions, guide processes, and optimize operations. It is an emerging field that is transforming the way engineers interact with their environment. To make use of data-driven engineering, data must be transferred from one system to another. This can be accomplished with a variety of methods, including Modbus, MQTT, OPC UA, and WebSockets. This section provides an overview of each of these transfer methods and how they can be used in data-driven engineering applications.

1️⃣ MODBUS

Modbus is an open communications protocol used to connect industrial electronic devices such as PLCs (Programmable Logic Controllers), RTUs (Remote Terminal Units), and IEDs (Intelligent Electronic Devices) to each other and to computers. It defines a set of rules for how these devices request and exchange data. It was originally developed by Modicon in 1979 for use with its PLCs and is still widely used today in industrial automation and control applications.
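
As a rough illustration of the kind of rules Modbus defines, the sketch below packs a Modbus TCP request to read holding registers (function code 3) with only the Python standard library. The helper name read_holding_registers_request and the example transaction ID, unit ID, address, and register count are assumptions chosen for illustration. A real application would more likely rely on a dedicated Modbus library than build frames by hand.

```python
import struct


def read_holding_registers_request(transaction_id, unit_id, start_address, quantity):
    """Build a Modbus TCP 'Read Holding Registers' request (hypothetical helper)."""
    function_code = 0x03
    # PDU: function code (1 byte), starting address (2 bytes), register count (2 bytes)
    pdu = struct.pack(">BHH", function_code, start_address, quantity)
    # MBAP header: transaction id, protocol id (0 = Modbus),
    # number of bytes that follow (unit id + PDU), unit id
    header = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return header + pdu


# Example values are assumptions: read 2 registers starting at address 0 from unit 1
frame = read_holding_registers_request(
    transaction_id=1, unit_id=1, start_address=0, quantity=2
)
print(frame.hex())
```

All fields are big-endian, which is why every format string starts with ">". The frame would be written to a TCP socket connected to the device, and the response uses the same header layout.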

2️⃣ MQTT

MQTT (Message Queuing Telemetry Transport) is a lightweight publish/subscribe messaging protocol used for machine-to-machine (M2M) communication. It is designed to be simple, efficient, and flexible, which makes it well suited to data-driven engineering applications. Devices publish messages to topics on a broker and subscribe to the topics they need, so data is transferred in near real time.
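
The publish/subscribe pattern can be sketched with a toy, in-process broker. This is not an MQTT client and it does no networking. ToyBroker, topic_matches, and the topic names are hypothetical and exist only to show how subscribers receive the messages whose topics match their filters, including the "+" (one level) and "#" (all remaining levels) wildcards used in MQTT topic filters.

```python
from collections import defaultdict


def topic_matches(topic_filter, topic):
    """Return True if an MQTT-style topic filter matches a topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard matches everything that remains
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False  # "+" matches any single level; otherwise must be equal
    return len(filter_levels) == len(topic_levels)


class ToyBroker:
    """In-memory stand-in for a broker (illustration only)."""

    def __init__(self):
        self.subscriptions = defaultdict(list)  # topic filter -> callbacks

    def subscribe(self, topic_filter, callback):
        self.subscriptions[topic_filter].append(callback)

    def publish(self, topic, payload):
        for topic_filter, callbacks in self.subscriptions.items():
            if topic_matches(topic_filter, topic):
                for callback in callbacks:
                    callback(topic, payload)


broker = ToyBroker()
received = []
broker.subscribe("plant/+/temperature", lambda t, p: received.append((t, p)))
broker.publish("plant/line1/temperature", "21.5")  # delivered
broker.publish("plant/line1/pressure", "101.3")    # no matching subscriber
print(received)
```

With a real broker, the publisher and subscriber are separate programs that only share the broker address and the topic names.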

3️⃣ OPC UA

OPC UA (Open Platform Communications Unified Architecture) is a secure, reliable, and scalable communication protocol for industrial automation. It is used to transfer data from field devices to enterprise systems.

4️⃣ WebSocket

WebSocket is a protocol for establishing a two-way communication channel between a client and a server. It is used to build real-time applications. Communication is bidirectional over a single persistent connection, so the client and the server can each send data at any time, which makes more efficient use of bandwidth and improves performance compared with repeated requests.
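
One small, checkable piece of how the channel is established is the opening handshake: the client sends a random key in an HTTP upgrade request and the server proves it understands WebSocket by returning a value derived from that key. The sketch below computes that value as described in RFC 6455. The function name websocket_accept_key is an assumption, and the sample key is the one used in the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept_key(client_key):
    """Compute the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455
print(websocket_accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
# -> s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

After the handshake succeeds, the same TCP connection stays open and both sides exchange framed messages. In practice a WebSocket library handles both steps.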

Batch or Stream

Data-driven engineering requires communication of information from sensors to repositories where it can be processed into actionable information. Data can be streamed continuously or sent in batches. Encryption protects the data, and security protocols ensure that unaltered data arrives only at the correct destination.
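
The difference between streaming and batching can be sketched with two generators. The function names and the temperature values are made up for illustration. In a real system each yielded item would be handed to a transfer protocol rather than collected in a list.

```python
def stream(readings):
    """Forward each reading as soon as it is available."""
    for reading in readings:
        yield reading


def batch(readings, size):
    """Group readings and forward them 'size' at a time."""
    group = []
    for reading in readings:
        group.append(reading)
        if len(group) == size:
            yield group
            group = []
    if group:
        yield group  # send the final partial batch


temperatures = [20.1, 20.3, 20.2, 20.6, 20.4]  # hypothetical sensor values
streamed = list(stream(temperatures))
batched = list(batch(temperatures, size=2))
print(len(streamed), "messages when streamed")
print(len(batched), "messages when batched:", batched)
```

Streaming gives the lowest delay per reading, while batching sends fewer, larger messages at the cost of waiting for each group to fill.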

Transport Layers

There are several protocols for transfer of data, depending on the objective of the application.

Two underlying transport layers are TCP (Transmission Control Protocol) and UDP (User Datagram Protocol).

TCP is a standard that defines how to establish and maintain a network connection over the Internet. It is the most commonly used protocol for data transfer and is the backbone of the Internet. For example, a computer sends TCP packets to a web server and the web server responds with TCP packets. A connection is established between the two devices before any data is exchanged. TCP uses checksums, acknowledgments, and retransmission to ensure that all packets are delivered successfully and in order. If a packet is lost or corrupted, TCP resends the data until delivery is confirmed.

UDP is a connectionless transport layer protocol that is often used for real-time streaming. It does not require a connection to be established, and it does not guarantee that data is delivered or that it arrives in order. The focus of UDP is to deliver the most recent data with low delay rather than to ensure data integrity and completeness.
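
The contrast between the two transport layers shows up directly in socket code. The sketch below sends one message over each on the local loopback interface with the Python standard library. The function names and the message are assumptions, and both ends run in one process only to keep the example self-contained.

```python
import socket


def tcp_roundtrip(message):
    """Send bytes over TCP: a connection is set up before data moves."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        with socket.create_connection(server.getsockname(), timeout=2.0) as client:
            conn, _ = server.accept()  # connection established here
            with conn:
                conn.settimeout(2.0)
                client.sendall(message)
                return conn.recv(1024)


def udp_roundtrip(message):
    """Send bytes over UDP: no connection, each datagram stands alone."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)  # a lost datagram would otherwise block forever
        sender.sendto(message, receiver.getsockname())
        data, _ = receiver.recvfrom(1024)
        return data


print(tcp_roundtrip(b"temperature=21.5"))
print(udp_roundtrip(b"temperature=21.5"))
```

The TCP version needs listen, connect, and accept before any data is sent. The UDP version simply sends to an address, and the timeout is there because nothing in the protocol reports a lost datagram.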

Many communication protocols are built on the TCP and UDP transport layers. Common communication protocols in automation systems are MODBUS, OPC (OLE for Process Control), and WebSockets. MODBUS, MQTT, and OPC UA are specific types of more general data transfer frameworks. Additional applications facilitate data subscription and sharing, such as ROS (Robot Operating System) and multi-platform clients such as the ROS Bridge library. Developers also use REST, GraphQL, and gRPC as more flexible frameworks to transfer structured data between applications. Organizations such as the Society of Automotive Engineers (SAE) have defined standards for communicating with passenger and off-road vehicles. Each communication standard is governed by an organization that manages the source or specifications for the protocol.

Application Programming Interfaces (APIs)

REST (Representational State Transfer) is a software architecture for Application Programming Interfaces (APIs). REST is a stateless architecture for data transfer: each request carries everything the server needs, and prior transactions are not referenced or stored. Advantages of REST include a well-established standard, simple use, wide adoption, and caching support. Developers use REST for building CRUD (Create, Read, Update, Delete) web applications with well-structured data. Node.js (JavaScript) uses Express for common web server tasks such as a REST API back end. Python has the Bottle and Flask frameworks for small web applications and rapid prototyping. A Python client is typically written with the requests package to send or retrieve data. Clients mostly send requests over HTTPS and communicate with JSON for human-readable structured data.
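
A minimal sketch of a stateless JSON exchange is shown below. To stay self-contained it uses only the Python standard library (http.server and http.client) instead of Flask and requests, and it runs over plain HTTP on the loopback interface. The /sensors/temp1 path, the SENSORS store, and the helper names are hypothetical.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

SENSORS = {}  # hypothetical in-memory resource store


class SensorHandler(BaseHTTPRequestHandler):
    def _send_json(self, status, body):
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_PUT(self):  # create or update a resource
        name = self.path.rsplit("/", 1)[-1]
        length = int(self.headers.get("Content-Length", 0))
        SENSORS[name] = json.loads(self.rfile.read(length))
        self._send_json(200, SENSORS[name])

    def do_GET(self):  # read a resource
        name = self.path.rsplit("/", 1)[-1]
        if name in SENSORS:
            self._send_json(200, SENSORS[name])
        else:
            self._send_json(404, {"error": "not found"})

    def log_message(self, *args):  # keep the demo output quiet
        pass


def call(port, method, path, payload=None):
    """Send one self-contained request and return (status, parsed JSON)."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    body = json.dumps(payload).encode() if payload is not None else None
    headers = {"Content-Type": "application/json"} if body else {}
    conn.request(method, path, body=body, headers=headers)
    response = conn.getresponse()
    result = (response.status, json.loads(response.read()))
    conn.close()
    return result


def demo():
    server = HTTPServer(("127.0.0.1", 0), SensorHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        call(server.server_port, "PUT", "/sensors/temp1",
             {"value": 21.5, "units": "degC"})
        return call(server.server_port, "GET", "/sensors/temp1")
    finally:
        server.shutdown()
        server.server_close()


print(demo())
```

Each request names the resource in the URL and the action in the HTTP method, so the server needs no memory of earlier requests beyond the stored data itself.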

GraphQL is an open-source query language for APIs with defined data structures. GraphQL is client-driven: the client states which fields it needs in a query against a strongly typed schema. It was originally developed by Meta and is now hosted by the non-profit Linux Foundation. It is supported by many programming languages, including Python. Developers use GraphQL for public APIs that need to be flexible in customizing requests from different sources.
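
To show what client-driven means in practice, the sketch below builds the JSON body that a client would typically POST to a GraphQL endpoint. The schema (a sensor type with value and units fields), the query name, and the helper function are all hypothetical. No server is contacted.

```python
import json

# The client lists exactly the fields it wants returned (hypothetical schema)
query = """
query SensorReading($name: String!) {
  sensor(name: $name) {
    value
    units
  }
}
"""


def graphql_request_body(query_text, variables):
    """Package a query and its variables as a JSON request body."""
    return json.dumps({"query": query_text, "variables": variables})


body = graphql_request_body(query, {"name": "temp1"})
print(body)
```

A different client could request only the value field from the same endpoint by changing the query, without any change on the server.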

gRPC is a lightweight and efficient protocol for calling procedures on a remote server. It relies on a contract between the server and the client that defines the available calls and message types. Handling and calculations are performed on the remote server that hosts the resource, while the client invokes those procedures much like local functions. The main gRPC advantages are lightweight client and server code in languages such as Python, efficient transfer of data with Protocol Buffers, and open-source code. Developers use gRPC for private APIs that have actions and where performance is important.

Automotive and Truck Data Transfer

Some machines have standardized data transfer interfaces such as OBD-II connections to automobiles with SAE J2480 protocol or SAE J1939 protocol for off-road heavy equipment and trucks. These protocols define a set of standards for how Electronic Control Units (ECUs) communicate through the CAN bus in vehicles.
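
As an illustration of what such a standard specifies, the sketch below splits a 29-bit J1939 CAN identifier into its fields, assuming the commonly documented layout: 3-bit priority, 1 reserved bit, 1 data page bit, 8-bit PDU format, 8-bit PDU specific, and 8-bit source address. The function name and the sample identifier are assumptions, and reading from real hardware would require a CAN interface and library that are not shown here.

```python
def decode_j1939_id(can_id):
    """Split a 29-bit J1939 identifier into its fields (assumed layout)."""
    priority = (can_id >> 26) & 0x7
    data_page = (can_id >> 24) & 0x1
    pdu_format = (can_id >> 16) & 0xFF
    pdu_specific = (can_id >> 8) & 0xFF
    source_address = can_id & 0xFF
    if pdu_format >= 240:
        # Broadcast message: PDU specific is part of the parameter group number
        pgn = (data_page << 16) | (pdu_format << 8) | pdu_specific
        destination = None
    else:
        # Addressed message: PDU specific is the destination address
        pgn = (data_page << 16) | (pdu_format << 8)
        destination = pdu_specific
    return {
        "priority": priority,
        "pgn": pgn,
        "source": source_address,
        "destination": destination,
    }


# Sample identifier chosen for illustration
print(decode_j1939_id(0x0CF00400))
```

The parameter group number (PGN) identifies which set of signals the 8 data bytes carry, and the source address identifies the ECU that sent them.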