Solving Critical IoT Challenges with Memfault and Golioth

In the ever-evolving world of the Internet of Things (IoT), connected devices are becoming increasingly ubiquitous, serving an ever-expanding array of use cases and satisfying the growing demand for smart, connected solutions. From smart home devices to industrial sensors, these IoT devices have transformed the way we interact with the world around us. However, as the number of these devices surges, so too do the challenges associated with managing and maintaining them.

The current state of the IoT hardware and software ecosystem presents us with both opportunities and complexities. On one hand, the explosion of IoT-enabling technology has paved the way for low-power, constrained devices, enabling novel applications and the collection of data that was previously unimaginable. On the other hand, the proliferation of these devices, often deployed in remote or inaccessible locations, presents a new set of obstacles.

Understanding why an IoT device is not functioning properly is a particularly vexing issue in this landscape, especially when physical access is limited or non-existent. Moreover, product and regulatory requirements now mandate the ability to swiftly identify and address issues in these devices, whether they are malfunctioning in a smart home environment, a manufacturing facility, or a healthcare setting. And users (rightfully so!) expect high-quality, reliable devices.

This is where Memfault and Golioth come into play. In this blog post, we will delve into how these two solutions are revolutionizing the management of IoT firmware, effectively addressing two of the most critical components of the IoT ecosystem: device connectivity and device diagnostics.

We’ll explore how Memfault and Golioth are reshaping the IoT landscape, offering innovative ways to manage, diagnose, and interface with IoT devices. In doing so, they are enabling organizations to streamline operations, enhance device reliability, and ultimately deliver a better experience to their end-users.

Recipe for Success

Establishing a game plan when developing a product that adheres to this growing list of requirements is critical. There are three main objectives that must be met to ensure that a device is secure, reliable, and useful to the end user.

[Figure: product development - Golioth and Memfault]

The first step is knowing when a device is misbehaving. A device can operate incorrectly in many ways, such as crashing unexpectedly, rapidly draining its battery, or failing to send data. In many cases, these behaviors will render the device useless to the end-user.

The next step is being able to rapidly diagnose observed issues. Given that IoT devices are frequently resource- and bandwidth-constrained, sending large amounts of non-application data is often a non-starter. Instead, developers and their organizations need to be able to quickly relate the minimal metadata delivered by devices to supplemental information in order to effectively troubleshoot. As a fleet grows, this can become a daunting task.
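One way to picture relating minimal device metadata to supplemental information: the device reports only a firmware version and a raw program counter, and the cloud looks that address up in a symbol table stored for that version. The sketch below assumes exactly that setup; the table contents, function names, and the `symbolicate` helper are hypothetical and not part of either platform.

```python
import bisect

# Hypothetical symbol tables keyed by firmware version.
# Each entry is (function start address, function name), sorted by address.
SYMBOLS = {
    "1.2.0": [
        (0x08000100, "main"),
        (0x08000400, "sensor_read"),
        (0x08000900, "radio_send"),
    ],
}


def symbolicate(version: str, pc: int) -> str:
    """Map a raw program counter to the function whose start address precedes it."""
    table = SYMBOLS[version]
    starts = [start for start, _ in table]
    index = bisect.bisect_right(starts, pc) - 1
    if index < 0:
        return "<unknown>"
    return table[index][1]


# The device only needs to send a version string and a single address.
print(symbolicate("1.2.0", 0x08000432))
```

The heavy data (the symbol table) never leaves the cloud, so the device-side cost stays at a few bytes per report.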

Finally, after identifying and diagnosing an issue, an update that remediates the problem must be delivered to the device, allowing it to return to its normal operating state. Though a single device may exhibit negative behavior, many more may be susceptible. Effectively grouping and targeting all or part of a fleet with an update is necessary to ensure that the issue does not persist.
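As a rough illustration of grouping and targeting part of a fleet, the sketch below selects devices by firmware version and an optional tag. The `Device` fields, tag names, and `select_targets` helper are assumptions made for this example, not the data model of either platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    device_id: str
    firmware: str
    tags: frozenset


def select_targets(fleet, affected_firmware, required_tag=None):
    """Return devices running an affected firmware version, optionally limited to a tag."""
    return [
        d for d in fleet
        if d.firmware in affected_firmware
        and (required_tag is None or required_tag in d.tags)
    ]


fleet = [
    Device("dev-a", "1.2.0", frozenset({"beta"})),
    Device("dev-b", "1.2.0", frozenset({"production"})),
    Device("dev-c", "1.3.0", frozenset({"production"})),
]

# Send the fix to beta devices first, then to every affected device.
first_wave = select_targets(fleet, {"1.2.0"}, required_tag="beta")
full_wave = select_targets(fleet, {"1.2.0"})
print([d.device_id for d in first_wave], [d.device_id for d in full_wave])
```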

Why Golioth & Memfault?

All of the goals in our recipe for success have a common dependency: connectivity. Without the ability to communicate with a device in the field, it is impossible to know whether it is behaving correctly and, if it isn’t, deploy an update to correct the problem. However, connectivity comes at a cost. Not only is there a literal expense for bandwidth usage, especially when using cellular networks, but there are also impacts on power, memory, and compute. And if you want communication between your devices and the cloud to be secure (trust us, you do!), those impacts can be even greater.

Golioth is a platform built to simplify the process of managing devices and establishing connectivity, enabling an efficient and secure channel between a device and any service it needs to talk to.

The Lifecycle of a Device on Golioth

Device management starts with provisioning, which can look different for products based on how they are manufactured and when and where they are deployed. Production devices use certificate-based authentication on Golioth, meaning that an administrator can create entries for devices via the Golioth API, or they can rely on zero-touch provisioning, in which a device is automatically created in their Golioth project the first time it connects.

[Figure: Zero touch provisioning - Golioth]

Certificate-based authentication works similarly to the manner in which your browser established a secure connection with this site when you followed the HTTPS link to this blog post. Depending on your browser, a bundle of root certificates associated with Certificate Authorities (CAs) either ships with the browser or is provided by your machine’s operating system. Websites you visit present a certificate that chains up to one of the CA certificates in the bundle. This allows your browser to verify that you are actually connected to blog.memfault.com, and not a site impersonating it.

On Golioth, users upload CA certificates to establish a trusted bundle. Then, when a device connects and presents its certificate, the platform performs a validation step similar to your browser’s to ensure that the certificate is signed by one of the trusted authorities. If the certificate is valid, the metadata it contains can be used to create a new device entry in the case of zero-touch provisioning.
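The decision flow described above can be sketched as follows. This is a toy model, not Golioth's implementation: the `Certificate` fields are hypothetical, and the cryptographic signature check is deliberately left as a pluggable `signature_ok` callable because real X.509 validation needs a proper cryptography library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    subject: str  # hypothetical: identifies the device
    project: str  # hypothetical metadata field naming the target project
    issuer: str   # name of the CA that signed this certificate


def provision_on_connect(cert, trusted_cas, devices, signature_ok):
    """Create a device entry on first connect if the certificate is signed by a trusted CA."""
    if cert.issuer not in trusted_cas:
        raise PermissionError("issuer is not in the trusted CA bundle")
    if not signature_ok(cert, trusted_cas[cert.issuer]):
        raise PermissionError("signature does not verify against the CA")
    # setdefault keeps an existing entry, so reconnects do not create duplicates.
    return devices.setdefault(cert.subject, {"project": cert.project})


trusted_cas = {"Acme Root CA": "placeholder-public-key"}
devices = {}


def always_ok(cert, ca_key):
    """Stand-in for real signature verification; always accepts."""
    return True


cert = Certificate(subject="sensor-001", project="demo-project", issuer="Acme Root CA")
entry = provision_on_connect(cert, trusted_cas, devices, always_ok)
print(devices)
```

The point of the sketch is the ordering: trust is established from the uploaded CA bundle first, and only then is certificate metadata used to create the device.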

Once connected, devices can be observed and managed using Golioth’s suite of device management services. These include logging, settings, over-the-air updates (OTA), remote procedure calls, and more. Each of these services plays an important role in providing users with a complete view of their fleet, and the ability to update devices as necessary. To enable this, devices need to be able to send data to Golioth, as well as receive data from the platform.

[Figure: Golioth platform]

All communication with Golioth happens over a single secure connection using the same certificates that were leveraged for initial provisioning. Because IoT use cases vary widely, Golioth uses the Constrained Application Protocol (CoAP) over DTLS / UDP, which provides security guarantees similar to those of HTTP or MQTT over TLS / TCP while moving the choice of reliable delivery to the application layer: each CoAP message can be sent as confirmable (acknowledged, and retransmitted if needed) or non-confirmable. This enables devices to ensure the delivery of critical data, while not wasting precious resources on non-critical data. Some CoAP networking stacks are also more compact and efficient than competing protocols due to reduced complexity in message handling.
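To make the per-message reliability choice concrete, here is a minimal sketch of the fixed 4-byte CoAP header defined in RFC 7252, where a 2-bit type field marks a message as confirmable or non-confirmable. The `coap_header` helper is written for this post and is not taken from any particular CoAP library; options and payload are omitted.

```python
import struct

COAP_VERSION = 1
TYPE_CON = 0      # confirmable: the receiver acknowledges, the sender retransmits otherwise
TYPE_NON = 1      # non-confirmable: sent once, no acknowledgement expected
CODE_POST = 0x02  # method code 0.02 (POST)


def coap_header(msg_type: int, code: int, message_id: int, token: bytes = b"") -> bytes:
    """Encode the fixed CoAP header followed by the token."""
    if len(token) > 8:
        raise ValueError("CoAP tokens are at most 8 bytes")
    # First byte: version (2 bits), type (2 bits), token length (4 bits).
    first_byte = (COAP_VERSION << 6) | (msg_type << 4) | len(token)
    return struct.pack("!BBH", first_byte, code, message_id) + token


critical = coap_header(TYPE_CON, CODE_POST, 0x1234, b"\xaa")
routine = coap_header(TYPE_NON, CODE_POST, 0x1235, b"\xab")
print(critical.hex(), routine.hex())
```

The two messages differ by a single bit in the first byte, which is all it takes for an application to decide which data is worth retransmitting.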

[Figure: CoAP Message - Golioth]

These attributes combine to ensure that devices connected to Golioth are communicating securely, maximizing available resources, and using the network efficiently. Not only does this enable users to leverage the hardware of their devices for more capabilities, but it also can result in reduced power consumption, allowing devices to continue operating in the field for longer.

Golioth: The Universal Connector

While Golioth offers a robust set of services for managing your device fleet, many of the benefits it provides would be undermined if devices had to establish additional connections, and potentially introduce other networking stacks, in order to talk to other platforms and services. As the IoT landscape continues to change rapidly, new use cases and requirements increase the likelihood that data flowing to and from a device will need to be routed to multiple locations. Ideally, all of that data could leverage the efficiency offered by Golioth’s single secure channel.

Today, Golioth enables routing data from devices to external services via output streams. For example, many Golioth users leverage public cloud providers to build the backend infrastructure for their connected applications. The services offered by these providers typically do not cater to the communication patterns of highly constrained IoT devices. Furthermore, if data needs to flow to multiple destinations that speak different protocols and require different authentication schemes, establishing and managing those connections on-device quickly becomes untenable.

[Figure: Connecting Golioth and Memfault]

Instead, Golioth enables devices to only consider a single destination, then empowers users to transform and route that data in the cloud where the majority of constraints are removed. Besides maximizing device efficiency, this also enables the end-to-end product to grow and evolve over time, without having to significantly modify the behavior of devices in a fleet. For every aspect of the product and stage of development, users are able to use the best tools for the job.
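The idea of a single device-side destination with cloud-side fan-out can be sketched roughly as below. The `Router` class, path prefixes, and destination callables are hypothetical and only illustrate the pattern; they do not describe how Golioth's output streams are configured.

```python
from collections import defaultdict


class Router:
    """Fan a single device stream out to any number of cloud-side destinations."""

    def __init__(self):
        self._routes = defaultdict(list)

    def add_route(self, path_prefix, destination):
        self._routes[path_prefix].append(destination)

    def dispatch(self, path, payload):
        """Deliver payload to every destination whose prefix matches; return the count."""
        delivered = 0
        for prefix, destinations in self._routes.items():
            if path.startswith(prefix):
                for destination in destinations:
                    destination(path, payload)
                    delivered += 1
        return delivered


diagnostics, warehouse = [], []
router = Router()
router.add_route("/diagnostics", lambda path, body: diagnostics.append(body))
router.add_route("/", lambda path, body: warehouse.append((path, body)))

# The device sends everything to one place; routing decisions live in the cloud.
router.dispatch("/diagnostics/chunk", b"\x01\x02")
router.dispatch("/sensor/temp", b"\x17")
print(len(diagnostics), len(warehouse))
```

Adding a new destination later means adding a route in the cloud, with no firmware change on the device.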

As previously described, the need for devices to communicate data about how they are operating is becoming more of a focal point. While Golioth enables devices to send that data efficiently, minimizing the volume of data and being able to tie it to supplemental information in the cloud are both vital. Memfault has long been the preferred platform for device diagnostics, telemetry, and crash analysis. In other words, it is the best tool for the job.

Memfault for Rich Device Diagnostics

[Figure: Memfault Cloud and Golioth]

Memfault’s solution for Device Diagnostics and Telemetry prioritizes these principles, which we believe are crucial for providing valuable insights on device reliability and performance:

  • Compatible with as many IoT devices and connectivity paths as possible
  • Efficient on-device data collection and transmission
  • Streamlined web dashboards for super simple feedback on device health

We’ve refined our on-device SDKs to capture the most valuable data for the problems we face in the IoT space, without burdening the system. Memfault compresses the data stored on the device, including crash reports, metrics, and log data. This saves memory and storage on the device itself and also conserves bandwidth, which is crucial for low-power or metered connections, such as LTE, BLE, Zigbee, LoRaWAN, etc. Read more about Memfault’s data serialization design here!
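To give a feel for why a compact on-device encoding matters on metered links, the sketch below compares a self-describing JSON report with a fixed binary layout for the same four values. The field names and layout are invented for this example and are not Memfault's actual serialization format.

```python
import json
import struct

# Hypothetical device report.
reading = {"uptime_s": 86400, "battery_mv": 3712, "reboots": 2, "temp_c_x10": 215}

# Self-describing text encoding: field names travel with every report.
as_json = json.dumps(reading, separators=(",", ":")).encode()

# Fixed binary layout agreed on ahead of time:
# uint32, uint16, uint8, int16, network byte order, no padding.
LAYOUT = struct.Struct("!IHBh")
as_binary = LAYOUT.pack(
    reading["uptime_s"],
    reading["battery_mv"],
    reading["reboots"],
    reading["temp_c_x10"],
)

print(len(as_json), len(as_binary))
```

The binary form relies on the cloud knowing the layout, which is the same trade the surrounding text describes: keep the device payload small and let the backend do the decoding.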

Memfault’s MCU SDK targets lightweight and resource-constrained devices like ARM Cortex-M chips with a few kilobytes of flash and RAM. It’s set up to be highly configurable, making the best use of available memory and storage. We have users drawing useful insights with only a few hundred bytes of data. For an example, see this presentation using Memfault on remote-operated buoys with very tight bandwidth requirements.

In a default configuration, the MCU SDK consumes approximately 4.5 kB of flash and 1.5 kB of RAM. Memfault's SDK is compatible with bare-metal (no RTOS) systems as well as all the popular modern RTOSs. We strive to meet users where they are: we only require a C99-compatible compiler to add our MCU SDK to your system!

Memfault pushes as much of the difficult processing and aggregation work as possible into our cloud platform, which means we can provide detailed diagnostic analysis from a very small amount of device data. Here’s an example coredump which was generated from just a few hundred bytes of data:

[Figure: Coredump Collection - Memfault]

Read more about Memfault Coredumps here.

On-device metrics can be added with just 2 lines of C code, and are automatically decoded by Memfault’s platform, so no cloud or backend effort is needed. The metrics data reported by individual devices can be aggregated and sliced by configurable properties, making it easy to draw insights at a detailed, single-device level, across groups of devices (for example, different firmware versions), or as trends across an entire fleet.
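The kind of slicing described above can be approximated in a few lines. This sketch assumes hypothetical heartbeat reports carrying a device ID, a firmware version, and one metric value; the `average_by` helper is illustrative and is not how Memfault's platform is implemented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical heartbeat reports:
# (device_id, firmware_version, battery_drop_pct_per_hour)
heartbeats = [
    ("dev-a", "1.2.0", 1.0),
    ("dev-b", "1.2.0", 3.0),
    ("dev-c", "1.3.0", 0.5),
    ("dev-a", "1.2.0", 2.0),
]


def average_by(reports, key_index, value_index=2):
    """Group reports by one column and average the metric within each group."""
    groups = defaultdict(list)
    for report in reports:
        groups[report[key_index]].append(report[value_index])
    return {key: mean(values) for key, values in groups.items()}


by_version = average_by(heartbeats, key_index=1)  # fleet view per firmware version
by_device = average_by(heartbeats, key_index=0)   # single-device view
print(by_version, by_device)
```

The same raw reports answer both questions, "is this device unhealthy?" and "did this release regress battery life?", depending only on the grouping key.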

[Figure: Metrics Chart - Memfault]

We aim to make it easy to drop our SDK into your project and get meaningful insights with as little overhead as possible. See our onboarding guides here for detailed getting-started instructions. On platforms like NXP's MCUXpresso SDK, Zephyr RTOS, Nordic’s nRF Connect SDK, and Espressif’s ESP-IDF, you can be up and running in as little as a few minutes!

Memfault diagnostic data can be uploaded directly from the instrumented device, or passed through any intermediate data pathway. We support routing data via CoAP, MQTT, BLE, Zigbee, LoRa, Matter, Thread, etc. As long as the data is eventually posted to Memfault’s HTTP endpoint, we can process it. Golioth provides a streamlined and efficient pathway for Memfault data without requiring you to configure the routing yourself, making it really easy to start uploading Memfault diagnostic data.
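For readers curious what "eventually posted to Memfault's HTTP endpoint" looks like from a relay's point of view, here is a hedged sketch that only builds the request and does not send it. Treat the URL pattern and the project key header name as assumptions to confirm against Memfault's documentation; the `build_chunk_request` helper and the sample values are hypothetical.

```python
import urllib.request

# Assumed URL pattern and header name; verify against Memfault's docs before use.
CHUNKS_URL = "https://chunks.memfault.com/api/v0/chunks/{device_serial}"
PROJECT_KEY_HEADER = "Memfault-Project-Key"


def build_chunk_request(device_serial: str, project_key: str, chunk: bytes):
    """Wrap an opaque chunk received from a device in an HTTP POST request."""
    return urllib.request.Request(
        CHUNKS_URL.format(device_serial=device_serial),
        data=chunk,  # the relay forwards the bytes untouched
        headers={
            PROJECT_KEY_HEADER: project_key,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )


request = build_chunk_request("DEVICE1", "hypothetical-project-key", b"\x08\x02\xa7")
print(request.get_method(), request.full_url)
# A real relay would now call urllib.request.urlopen(request) and handle errors.
```

Because the chunk is opaque to the relay, the transport used between device and relay (CoAP, BLE, or anything else) does not affect what is posted.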

Wrap up, get going with Golioth and Memfault 🚀

We will be providing more information and a demonstration of the Golioth-Memfault integration at this Thursday’s webinar. If you are interested in getting early access, make sure to add yourself to the waitlist. To get started using both platforms today, check out the Golioth and Memfault documentation!
