Sometimes you need to find out what a record looked like at some point in the past. This is known as the Slowly Changing Dimension problem. Most database models - by design - don’t keep the history of a record when it’s updated. But there are plenty of reasons why you might need to do this, such as audit/security purposes, implementing undo functionality, or showing a model’s change over time for stats or comparison. There are a few ways to do this in PostgreSQL, but this article is going to focus on the implementation provided by the SQL:2011 standard, which added support for temporal databases. It’s also going to focus on actually querying that historical data, with some real-life use cases. PostgreSQL doesn’t support these features natively, but a temporal tables extension approximates them.
A fair, low-latency, multi-tenant queue which operates with multiple shared-nothing workers that claim jobs in an (almost) contention-free way.
A BipBuffer is a bi-partite circular buffer that always supports writing a contiguous chunk of data, instead of potentially splitting a write in two chunks when it straddles the buffer's boundaries. Circular buffers are a common primitive for asynchronous (inter- or intra- thread) communication. Let's start with a very abstract, idealised view of the circular buffer interface, and then consider real-world constraints one by one, till we get to the BipBuffer design.
Large-scale data analytics frameworks are shifting towards shorter task durations and larger degrees of parallelism to provide low latency. Scheduling highly parallel jobs that complete in hundreds of milliseconds poses a major challenge for task schedulers, which will need to schedule millions of tasks per second on appropriate machines while offering millisecond-level latency and high availability. We demonstrate that a decentralized, randomized sampling approach provides near-optimal performance while avoiding the throughput and availability limitations of a centralized design. We implement and deploy our scheduler, Sparrow, on a 110-machine cluster and demonstrate that Sparrow performs within 12% of an ideal scheduler.
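To make the "decentralized, randomized sampling" idea in the Sparrow abstract concrete, here is a minimal sketch of one common form of it: probe `d` randomly chosen workers and place the task on the least loaded one. This is an illustration under stated assumptions, not Sparrow's actual algorithm; the function names, the queue model (tasks never finish), and the parameter values are all hypothetical.

```python
import random


def place_task(queue_lengths, d=2, rng=random):
    """Probe d distinct random workers and return the index of the least loaded one."""
    probed = rng.sample(range(len(queue_lengths)), d)
    return min(probed, key=lambda i: queue_lengths[i])


def simulate(num_workers=100, num_tasks=10_000, d=2, seed=0):
    """Place tasks one by one and return the longest queue at the end.

    Assumption: tasks never complete, so this only shows how evenly
    the placement rule spreads load, not real scheduling latency.
    """
    rng = random.Random(seed)
    queues = [0] * num_workers
    for _ in range(num_tasks):
        queues[place_task(queues, d, rng)] += 1
    return max(queues)


if __name__ == "__main__":
    print("max queue, d=1 (purely random):", simulate(d=1))
    print("max queue, d=2 (two probes):   ", simulate(d=2))
```

Running it with `d=1` and `d=2` shows how a second probe per task tightens the load spread without any central view of the cluster.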
A fast and flexible allocator for no_std and WebAssembly
Creating new materials, discovering new drugs, and simulating systems are essential processes for research and innovation and require substantial computational power. While many applications can be split into many smaller independent tasks, some cannot and may take hours or weeks to run to completion. To better manage those longer-running jobs, it would be desirable to stop them at any arbitrary point in time and later continue their computation on another compute resource; this is usually referred to as checkpointing. While some applications can manage checkpointing programmatically, it would be preferable if the batch scheduling system could do that independently. This paper evaluates the feasibility of using CRIU (Checkpoint Restore in Userspace), an open-source tool for the GNU/Linux environments, emphasizing the OSG's OSPool HTCondor setup. CRIU allows checkpointing the process state into a disk image and can deal with both open files and established network connections seamlessly. Furthermore, it can checkpoint traditional Linux processes and containerized workloads. The functionality seems adequate for many scenarios supported in the OSPool. However, some limitations prevent it from being usable in all circumstances.
With DotSlash, a set of platform-specific executables is replaced with a single script containing descriptors for the supported platforms. DotSlash handles transparently fetching, decompressing, and verifying the appropriate remote artifact for the current operating system and CPU
IfState is a python3 utility to configure the Linux network stack in a declarative manner. It is a frontend for the kernel’s netlink protocol
Message brokers often mediate communication between data producers and consumers by adding variable-sized messages to ordered distributed queues. Our goal is to determine the number of consumers and consumer-partition assignments needed to ensure that the rate of data consumption keeps up with the rate of data production. We model the problem as a variable item size bin packing problem. As the rate of production varies, new consumer-partition assignments are computed, which may require rebalancing a partition from one consumer to another. While rebalancing a queue, the data being produced into the queue is not read, leading to additional latency costs. As such, we focus on the multi-objective optimization cost of minimizing both the number of consumers and queue migrations. We present a variety of algorithms and compare them to established bin packing heuristics for this application. Comparing our proposed consumer group assignment strategy with Kafka's, a commonly employed strategy, our strategy presents a 90th percentile latency of 4.52s compared to Kafka's 217s with both using the same number of consumers. Kafka's assignment strategy only improved the consumer group's performance with regard to latency with configurations that used at least 60% more resources than our approach.
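The bin packing framing in the abstract above can be sketched with first-fit decreasing, one of the established heuristics such work is typically compared against. This is not the paper's proposed algorithm: partition names, rates, and the single `consumer_capacity` value are hypothetical, and the sketch ignores migration cost entirely.

```python
def first_fit_decreasing(partition_rates, consumer_capacity):
    """Assign partitions (items) to consumers (bins) by production rate.

    partition_rates: dict mapping partition name -> production rate.
    consumer_capacity: maximum total rate one consumer can keep up with.
    Returns a list of consumers, each with its total load and partitions.
    """
    consumers = []
    # Place the largest partitions first; each goes to the first consumer with room.
    ordered = sorted(partition_rates.items(), key=lambda kv: kv[1], reverse=True)
    for name, rate in ordered:
        if rate > consumer_capacity:
            raise ValueError(f"partition {name} exceeds a single consumer's capacity")
        for consumer in consumers:
            if consumer["load"] + rate <= consumer_capacity:
                consumer["load"] += rate
                consumer["partitions"].append(name)
                break
        else:
            # No existing consumer has room, so start a new one.
            consumers.append({"load": rate, "partitions": [name]})
    return consumers


if __name__ == "__main__":
    rates = {"p0": 6, "p1": 5, "p2": 4, "p3": 3, "p4": 2}
    for consumer in first_fit_decreasing(rates, consumer_capacity=10):
        print(consumer)
```

Re-running this from scratch whenever rates change can move many partitions between consumers, which is the rebalancing cost the abstract says its algorithms also try to minimize.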
Pgtemp is a Rust library and cli tool that allows you to easily create temporary PostgreSQL servers for testing without using Docker. The pgtemp Rust library allows you to spawn a PostgreSQL server in a temporary directory and get back a full connection URI with the host, port, username, and password.
This article explores the live migration steps QEMU performs and how it tracks the information it needs to make the process transparent. It explains how QEMU coordinates with vhost-kernel, the device already described in the vhost-net deep dive. I will show how the device can report all the data required for the destination QEMU to continue the device operation. I will also explain how the guest can switch device properties, such as MAC address or number of active queues, and resume the workload seamlessly in the destination.
As cloud computing usage grows, cloud data centers play an increasingly important role. To maximize resource utilization, ensure service quality, and enhance system performance, it is crucial to allocate tasks and manage performance effectively. The purpose of this study is to provide an extensive analysis of task allocation and performance management techniques employed in cloud data centers. The aim is to systematically categorize and organize previous research by identifying the cloud computing methodologies, categories, and gaps. A literature review was conducted, which included the analysis of 463 task allocation and 480 performance management papers. The review revealed three task allocation research topics and seven performance management methods. Task allocation research areas are resource allocation, load balancing, and scheduling. Performance management includes monitoring and control, power and energy management, resource utilization optimization, quality of service management, fault management, virtual machine management, and network management. The study proposes new techniques to enhance cloud computing work allocation and performance management. Shortcomings in each approach can guide future research. The research's findings on cloud data center task allocation and performance management can assist academics, practitioners, and cloud service providers in optimizing their systems for dependability, cost-effectiveness, and scalability. Innovative methodologies can steer future research to fill gaps in the literature.
This documents the settings we use at Let's Encrypt to create ZFS backing storage for MariaDB, and the tips and best practices that led us here.
Standard Webhooks is a set of open source tools and guidelines to send webhooks easily, securely and reliably. Webhooks are becoming increasingly popular, though every webhooks provider implements them differently and with varying quality. This makes it hard for providers who need to reinvent the wheel every time and repeat the same costly mistakes, and annoying for consumers who need to have a different implementation for each provider. It's also holding back the ecosystem as a whole, as these incompatibilities mean that no tools are being built to help senders send, consumers consume, and for everyone to innovate on top.
The following guide covers how to install and deploy OpenPubkey SSH to enable SSH access without the use of SSH keys.
We present Schism, a novel workload-aware approach for database partitioning and replication designed to improve scalability of shared-nothing distributed databases. Because distributed transactions are expensive in OLTP settings (a fact we demonstrate through a series of experiments), our partitioner attempts to minimize the number of distributed transactions, while producing balanced partitions. Schism consists of two phases: i) a workload-driven, graph-based replication/partitioning phase and ii) an explanation and validation phase. The first phase creates a graph with a node per tuple (or group of tuples) and edges between nodes accessed by the same transaction, and then uses a graph partitioner to split the graph into k balanced partitions that minimize the number of cross-partition transactions. The second phase exploits machine learning techniques to find a predicate-based explanation of the partitioning strategy (i.e., a set of range predicates that represent the same replication/partitioning scheme produced by the partitioner). The strengths of Schism are: i) independence from the schema layout, ii) effectiveness on n-to-n relations, typical in social network databases, iii) a unified and fine-grained approach to replication and partitioning. We implemented and tested a prototype of Schism on a wide spectrum of test cases, ranging from classical OLTP workloads (e.g., TPC-C and TPC-E), to more complex scenarios derived from social network websites (e.g., Epinions.com), whose schema contains multiple n-to-n relationships, which are known to be hard to partition. Schism consistently outperforms simple partitioning schemes, and in some cases proves superior to the best known manual partitioning, reducing the cost of distributed transactions up to 30%.
Choreographies are coordination plans for concurrent and distributed systems. A choreography defines the roles of the involved participants and how they are supposed to work together. In the emerging paradigm of choreographic programming (CP), choreographies are programs that can be compiled to executable implementations
This article will tell you how to implement a simple controller in software and how to tune it without getting into heavy mathematics and without requiring you to learn any control theory. The technique used to tune the controller is a tried and true method that can be applied to almost any control problem with success.
It's a non-sharded, strict serializable, fault tolerant, key-value store that supports point writes, reads and range reads. Notice that it provides a key-value API (not SQL). It's also not sharded, meaning the entire key space is essentially on one logical shard. That's it. Once you have a strict serializable key-value store, you can layer a SQL engine and secondary indexes on top. A strict serializable (can be relaxed if needed obviously) key-value store is the foundation (a smaller reusable component), upon which you can build distributed databases almost[1] however you want. This is a great design choice.
Fck-nat offers ready-to-use ARM and x86 based AMIs built on Amazon Linux 2023 which can support up to 5Gbps burst NAT traffic on a t4g.nano instance
"Rootless containers" is a concept to run the entire container runtimes and containers without the root privileges. It protects the host environment from attackers exploiting container runtime vulnerabilities. However, when rootless containers communicate with external endpoints, the network performance is low compared to rootful containers because of the overhead of rootless networking components. In this paper, we propose bypass4netns that accelerates TCP/IP communications in rootless containers by bypassing slow networking components. bypass4netns uses sockets allocated on the host. It switches sockets in containers to the host's sockets by intercepting syscalls and injecting the file descriptors using Seccomp. Our method with Seccomp can handle statically linked applications that previous works could not handle. Also, we propose high-performance rootless multi-node communication. We confirmed that rootless containers with bypass4netns achieve more than 30x faster throughput than rootless containers without it. In addition, we evaluated performance with applications and it showed large improvements on some applications.
Cryptography-x509-verification, a brand-new, pure-Rust implementation of the X.509 path validation algorithm that TLS and other encryption and authentication protocols are built on. Our implementation is fast, standards-conforming, and memory-safe, giving the Python ecosystem a modern alternative to OpenSSL’s misuse- and vulnerability-prone X.509 APIs for HTTPS
The pgloader tool is meant to allow one to implement the Continuous Migration project methodology when migrating to PostgreSQL. This methodology is meant to reduce risks inherent to such complex projects. After having been involved in many migration projects in the past, I decided to publish a White Paper about this project methodology!
Technitium DNS Server is an open source authoritative as well as recursive DNS server that can be used for self hosting a DNS server for privacy & security. It works out-of-the-box with no or minimal configuration and provides a user friendly web console accessible using any modern web browser.
I’ve spent quite a lot of time messing with x86_64 page tables. Understanding address translation is not easy, and when I started learning about it I felt like a lot of the material out there on how it works was hard for me to wrap my head around. So in this blog post I am going to attempt to provide a kind of “what I wish I had when learning about paging”.
In games, PIDs can be used for a simulation of any of the real world purposes. It can also be used to give a human-like feel to an AI.
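For readers who have not met a PID controller before, the textbook discrete form is short enough to sketch. The class below is a generic illustration, not code from either controller article in this list; the gains, the time step, and the toy "plant" in `run_to_setpoint` (a value that simply moves at the commanded rate) are assumptions chosen only to make the example runnable.

```python
class PID:
    """Textbook discrete PID controller: output = P + I + D terms."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured, dt):
        error = setpoint - measured
        # Integral term accumulates error over time.
        self.integral += error * dt
        # Derivative term is zero on the first call, when there is no history.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def run_to_setpoint(controller, setpoint, start=0.0, dt=0.1, steps=200):
    """Hypothetical plant: the controller output is treated as a velocity."""
    value = start
    for _ in range(steps):
        value += controller.update(setpoint, value, dt) * dt
    return value


if __name__ == "__main__":
    print(run_to_setpoint(PID(kp=1.0, ki=0.0, kd=0.0), setpoint=10.0))
```

In a game, `measured` might be an AI character's current heading and `setpoint` the heading toward its target; softer gains then produce the slightly laggy, human-like motion the blurb mentions.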
People involved in the PostgreSQL community, as organizations or individuals, are posting vast amounts of useful information on blogs, wikis, and websites. However, they are scattered, and it will not be easy to find the information you are looking for. Therefore, this page has compiled a collection of links to articles that the editor(s) found informative, from hundreds of blog sites registered on Planet PostgreSQL, plus the PostgreSQL wiki and websites. Some notable topics from these articles are picked up here, trying to organize and summarize them for introduction purposes
Dynamic programming itself is mostly natural when you understand what it does. And many common algorithms are actually just the application of dynamic programming to specific problems, including omnipresent path-finding algorithms such as Dijkstra’s algorithm.
In the past few years, I've heard a lot about Avro, Parquet, ORC, Arrow and Feather, but I also keep hearing about Iceberg and Delta Lake. As a "database person", I’ve been struggling to understand all of these different things, and how they relate to Data Lakes and Data Lakehouses (and what exactly are these?). So, I’ve decided to study them, and consolidate my knowledge in writing.
Most database management systems cache pages from storage in a main memory buffer pool. To do this, they either rely on a hash table that translates page identifiers into pointers, or on pointer swizzling which avoids this translation. In this work, we propose vmcache, a buffer manager design that instead uses hardware-supported virtual memory to translate page identifiers to virtual memory addresses. In contrast to existing mmap-based approaches, the DBMS retains control over page faulting and eviction. Our design is portable across modern operating systems, supports arbitrary graph data, enables variable-sized pages, and is easy to implement. One downside of relying on virtual memory is that with fast storage devices the existing operating system primitives for manipulating the page table can become a performance bottleneck. As a second contribution, we therefore propose exmap, which implements scalable page table manipulation on Linux. Together, vmcache and exmap provide flexible, efficient, and scalable buffer management on multi-core CPUs and fast storage devices.
If you're like me - you can't type your complex password correctly when your entire team is staring at you on a pair coding call. When you're done reading this post, you'll never need to again. Instead, you'll tap your Yubikey to execute a sudo command without ever touching a password prompt. Next, I'll show you how to automatically sign your GitHub commits with the private PGP key that only exists physically on my Yubikey 5 NFC and which cannot be exported from the device
Binary fuse filters & xor filters are probabilistic data structures which allow for quickly checking whether an element is part of a set. Both are faster and more concise than Bloom filters, and smaller than Cuckoo filters.
Open Source backend for your next SaaS and Mobile app in 1 file
Developers often deploy database-specific network proxies whereby applications connect transparently to the proxy instead of directly connecting to the database management system (DBMS). This indirection improves system performance through connection pooling, load balancing, and other DBMS-specific optimizations. Instead of simply forwarding packets, these proxies implement DBMS protocol logic (i.e., at the application layer) to achieve this behavior. Consequently, existing proxies are user-space applications that process requests as they arrive on network sockets and forward them to the appropriate destinations. This approach incurs inefficiencies as the kernel repeatedly copies buffers between user-space and kernel-space, and the associated system calls add CPU overhead. This paper presents user-bypass, a technique to eliminate these overheads by leveraging modern operating system features that support custom code execution.
Serverless computing, commonly offered as Function-as-a-Service, was initially designed for small, lean applications. However, there has been an increasing desire to run larger, more complex applications (what we call bulky applications) in a serverless manner. Existing strategies for enabling such applications are to either increase function sizes or to rewrite applications as DAGs of functions. These approaches cause significant resource wastage, manual efforts, and/or performance overhead. We argue that the root cause of these issues is today's function-centric serverless model, where a function is the resource allocation and scaling unit. We propose a new, resource-centric serverless-computing model for executing bulky applications in a resource- and performance-efficient way, and we build the Zenix serverless platform following this model. Our results show that Zenix reduces resource consumption by up to 90% compared to today's function-centric serverless systems, while improving performance by up to 64%.
I am surveying real-world serverless, multi-tenant data architectures to understand how different types of systems, such as OLTP databases, real-time OLAP, cloud data warehouses, event streaming systems, and more, implement serverless MT
Log-structured merge (LSM) trees have emerged as one of the most commonly used storage-based data structures in modern data systems as they offer high throughput for writes and good utilization of storage space. However, LSM-trees were not originally designed to facilitate efficient reads. Thus, state-of-the-art LSM engines employ numerous optimization techniques to make reads efficient. The goal of this tutorial is to present the fundamental principles of the LSM paradigm along with the various optimization techniques and hybrid designs adopted by LSM engines to accelerate reads. Toward this, we first discuss the basic LSM operations and their access patterns. We then discuss techniques and designs that optimize point and range lookups in LSM-trees
A simple Service Level Calculator that complies with Google SRE books.
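The core arithmetic behind a service level calculator is the error budget: the fraction of a time window in which the service is allowed to miss its objective. The sketch below is not taken from the linked tool; the function name and the 30-day default window are assumptions.

```python
def allowed_downtime_minutes(slo_percent, window_days=30):
    """Return the error budget, in minutes, for an availability SLO.

    slo_percent: target availability, e.g. 99.9 for "three nines".
    window_days: length of the measurement window (assumed 30 days here).
    """
    if not 0 <= slo_percent <= 100:
        raise ValueError("slo_percent must be between 0 and 100")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


if __name__ == "__main__":
    for slo in (99.0, 99.9, 99.99):
        print(f"{slo}% over 30 days allows {allowed_downtime_minutes(slo):.2f} minutes down")
```

For example, a 99.9% objective over 30 days leaves roughly 43.2 minutes of budget, and a 99% objective leaves 432 minutes.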
The IETF is introducing a new type of record for DNS called SVCB/HTTPS. This SVCB/HTTPS record can solve some very important problems and I have been following the advancement of the standard. Today I’ll give you an introduction
On mainstream 64-bit systems, the maximum bit-width of a virtual address is somewhat lower than 64 bits (commonly 48 bits). This gives an opportunity to repurpose those unused bits for data storage, if you're willing to mask them out before using your pointer (or have a hardware feature that does that for you - more on this later). I wondered what happens to userspace programs relying on such tricks as processors gain support for wider virtual addresses, hence this little blog post. TL;DR is that there's no real change unless certain hint values to enable use of wider addresses are passed to mmap, but read on for more details as well as other notes about the general topic of storing data in pointers.
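The masking trick described above can be modeled with plain integers. This is only a model: Python has no raw pointers, the 48-bit width is the common case the blurb cites rather than a guarantee, and the sketch assumes userspace addresses whose unused upper bits are zero. The names `pack` and `unpack` are hypothetical.

```python
ADDR_BITS = 48                      # assumed virtual address width
TAG_BITS = 64 - ADDR_BITS           # 16 spare upper bits in a 64-bit word
ADDR_MASK = (1 << ADDR_BITS) - 1


def pack(addr, tag):
    """Store a small tag in the unused upper bits of a 64-bit 'pointer'."""
    if addr & ~ADDR_MASK:
        raise ValueError("address does not fit in 48 bits")
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag does not fit in the spare bits")
    return (tag << ADDR_BITS) | addr


def unpack(word):
    """Mask the tag out before 'dereferencing'; return (address, tag)."""
    return word & ADDR_MASK, word >> ADDR_BITS


if __name__ == "__main__":
    word = pack(0x7FFF_DEAD_BEEF, 0xABCD)
    addr, tag = unpack(word)
    print(hex(word), hex(addr), hex(tag))
```

The first `ValueError` in `pack` is exactly the failure mode the post worries about: if the system starts handing out addresses wider than 48 bits, there is no longer room for the tag.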
Pgroll is an open source command-line tool that offers safe and reversible schema migrations for PostgreSQL by serving multiple schema versions simultaneously. It takes care of the complex migration operations to ensure that client applications continue working while the database schema is being updated. This includes ensuring changes are applied without locking the database, and that both old and new schema versions work simultaneously (even when breaking changes are being made!). This removes risks related to schema migrations, and greatly simplifies client application rollout, also allowing for instant rollbacks.
We recently upgraded from Postgres 11.9 to 15.3 with zero downtime by using logical replication, a suite of support scripts, and tools in Elixir & Erlang’s BEAM virtual machine. This post will go into far too much detail explaining how we did it, and considerations you might need to make along the way if you try to do the same.
Logical decoding is a mechanism that enables users to stream changes on Postgres as a sequence of logical operations like INSERTs, UPDATEs, and DELETEs. This is useful for applications that need to keep an external data store synchronized with Postgres. For example, replicating data in Postgres to Data Warehouses for analytics.
Researchers and practitioners care deeply about the performance and correctness of microservice applications. To investigate problematic application behavior and prototype potential improvements, researchers and practitioners experiment with different designs, implementations, and deployment configurations. We argue that a key requirement for microservice experimentation is the ability to rapidly reconfigure applications and to iteratively Configure, Build, and Deploy (CBD) new variants of an application that alter or improve its design. We focus on three core experimentation use-cases: (1) updating the design to use different components, libraries, and mechanisms; (2) identifying and reproducing problematic behaviors caused by different designs; and (3) prototyping and evaluating potential solutions to such behaviors. We present Blueprint, a microservice development toolchain that enables rapid CBD. With a few lines of code, users can easily reconfigure an application’s design; Blueprint then generates a fully-functioning variant of the application under the new design. Blueprint is open-source and extensible; it supports a wide variety of reconfigurable design dimensions. We have ported all major microservice benchmarks to it. Our evaluation demonstrates how Blueprint simplifies experimentation use-cases with orders-of-magnitude less code change.
If you’re just going to be reading from a possibly-stale cache at some level above, did you really need etcd at all?
This document argues that, while decentralized technical standards may be necessary to avoid centralization of Internet functions, they are not sufficient to achieve that goal because centralization is often caused by non-technical factors outside the control of standards bodies. As a result, standards bodies should not fixate on preventing all forms of centralization; instead, they should take steps to ensure that the specifications they produce enable decentralized operation
Time-Sorted Unique Identifiers (TSID). It brings together ideas from Twitter's Snowflake and ULID Spec. In summary: Sorted by generation time; Can be stored as an integer of 64 bits; Can be stored as a string of 13 chars; String format is encoded to Crockford's base32; String format is URL safe, is case insensitive, and has no hyphens; Shorter than UUID, ULID and KSUID.
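The "64-bit integer" and "13-character string" forms listed above are two views of the same value, since 13 Crockford base32 characters carry 65 bits. The sketch below shows only that string conversion. It does not reproduce how a TSID's bits are split between time and other fields, and the function names are hypothetical.

```python
# Crockford's base32 alphabet: digits plus letters, excluding I, L, O and U.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
# Crockford decoding treats I and L as 1, and O as 0.
ALIASES = str.maketrans("ILO", "110")


def encode(value):
    """Encode an unsigned 64-bit integer as 13 Crockford base32 characters."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value must fit in 64 bits")
    # Shifts 60, 55, ..., 0 give 13 groups of 5 bits, most significant first.
    return "".join(ALPHABET[(value >> shift) & 0x1F] for shift in range(60, -1, -5))


def decode(text):
    """Decode a 13-character string back to an integer, ignoring case."""
    if len(text) != 13:
        raise ValueError("expected exactly 13 characters")
    value = 0
    for ch in text.upper().translate(ALIASES):
        value = (value << 5) | ALPHABET.index(ch)
    if value >= (1 << 64):
        raise ValueError("decoded value does not fit in 64 bits")
    return value


if __name__ == "__main__":
    n = 0x0123_4567_89AB_CDEF
    s = encode(n)
    print(s, decode(s.lower()) == n)
```

Because the encoding is fixed-width and most-significant-first, sorting the strings lexicographically gives the same order as sorting the integers, which is what keeps the string form time-sorted.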
We present Prequal (Probing to Reduce Queuing and Latency), a load balancer for distributed multi-tenant systems. Prequal aims to minimize real-time request latency in the presence of heterogeneous server capacities and non-uniform, time-varying antagonist load. It actively probes server load to leverage the power-of-d-choices paradigm, extending it with asynchronous and reusable probes. Cutting against received wisdom, Prequal does not balance CPU load, but instead selects servers according to estimated latency and active requests-in-flight (RIF). We explore its major design features on a testbed system and evaluate it on YouTube, where it has been deployed for more than two years. Prequal has dramatically decreased tail latency, error rates, and resource use, enabling YouTube and other production systems at Google to run at much higher utilization.
With this new feature, you can setup a Postgres Read Replica and start replicating data from it instead of the Primary. It completely eliminates the risk of affecting your Primary database.
The core thesis of the "midlayer mistake" is that midlayers are bad and should not exist. That common functionality which it is so tempting to put in a midlayer should instead be provided as library routines which can be used, augmented, or ignored by each bottom level driver independently. Thus every subsystem that supports multiple implementations (or drivers) should provide a very thin top layer which calls directly into the bottom layer drivers, and a rich library of support code that eases the implementation of those drivers. This library is available to, but not forced upon, those drivers.
In this world, CI as a SaaS feels like accidental complexity of the midlayer mistake variety. Can we make it simpler? Can we say that CI is just a “program” for a distributed computer? So, in your project’s repo, there’s a ./ci folder with such a program: a bunch of Docker files, or .yamls, or whatever is the “programming language of the cloud”. You then point, say, AWS to it, tell it “run this, here are my credentials”, and you get your entire CI infra, with not rocket science rule, continuous fuzzing, releases, and what not. And, crucially, whatever project specific logic you need: AWS doesn’t care what it runs, everything is under your control.
Job scheduling in cloud computing environments is a critical yet complex problem. Cloud computing user job requirements are highly dynamic and uncertain, while cloud computing resources are heterogeneous and constrained. This paper studies the online resource allocation problem for elastic computing jobs with soft deadlines in cloud computing environments. The main contributions include: 1) Integer linear programming modeling is used to design an auction time scheduling framework with three key modules - resource allocation, evaluation, and operation, which can dynamically allocate resources in closed loops. 2) Methods such as time-based single resource utilization evaluation and weighted average evaluation are proposed to evaluate resource usage efficiency. 3) Soft acceptance protocols are introduced to achieve elastic online resource allocation. 4) The time complexity of the proposed algorithms is analyzed and proven to be polynomial time, demonstrating efficiency. 5) Modular design makes the framework extensible. This paper provides a structured cloud computing auction framework as a reference for building practical cloud resource management systems. Future work may explore more complex models of random arrival and multi-dimensional resource constraints, evaluate algorithm performance on real cloud workloads, and further enhance system robustness, efficiency and fairness.
Have you ever wished you had the convenience of Unix Domain Sockets even when transmitting data between cluster nodes? Where you yourself determine the addresses you want to bind to and use? Where you don't have to perform DNS lookups and worry about IP addresses? Where you don't have to start timers to monitor the continuous existence of peer sockets? And yet without the downsides of that socket type, such as the risk of lingering inodes? Welcome to the Transparent Inter Process Communication service, TIPC in short, which gives you all of this, and a lot more.
An eviction algorithm that is simpler than LRU for web caches
I was reading Tin Tvrtković’s article on making attr instances frozen at compile time. He mentions how to leverage mypy to enforce instance immutability statically and use mutable attr classes at runtime to avoid any instantiation cost. I wanted to see if I could do the same with standard data classes.
A dynamic tracer for Linux
A high-performance, recursive DNS resolver server with DNSSEC support, focused on preserving privacy
Web-based tool to facilitate OpenTelemetry collector configuration editing and verification.
High-velocity, monorepo-scale workflow for Git
In this “living guide”, I aim to cover the LTO-related features I have encountered thus far. I intend to keep updating this going forward as I learn about new details in this space.
StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs.
The WasmFX project extends WebAssembly (abbreviated Wasm) with effect handlers as a unifying mechanism to enable efficient compilation of control idioms, such as async/await, generators/iterators, first-class continuations, etc.
I’ll start by saying that I think the idea behind OpenPubKey is extremely cool and demonstrates the (basic) workability of a technique (binding an ephemeral signing key to a semi-permanent identity in a globally verifiable way without additional trusted services) that I think is both extremely useful and powerful. At the same time, I have concerns about OpenPubKey’s privacy properties, its actual ability to provide reliable “keyless” signatures, and its compatibility with and implications for OIDC practices within IdPs. This post is an attempt to elaborate on those concerns.
OpenPubkey is the web's new technology for adding public keys to standard SSO interactions with Identity Providers (IdPs) that use OpenID Connect (OIDC). OpenPubkey works by essentially turning an IdP into a Certificate Authority (CA). A CA is a trusted entity that issues certificates that cryptographically bind an identity with a cryptographic public key. With OpenPubkey, any IdP that supports OpenID Connect can bind public keys to identities. OpenPubKey introduces a new cryptographic object — the PK Token — that binds an identity (i.e., a user or a workload) to its public key. OpenPubkey works with today’s major IdPs, because it doesn't require any changes to OpenID Connect.
This project contains a durable coroutine compiler and runtime library for Go. The coroutine package can be used as a simple library to create coroutines in a Go program, allowing the function passed as entry point to the coroutine to be paused at yield points and later resumed by the caller. When pausing, the coroutine yields a value that is received by the caller, and on resumption the caller can send back a value that the coroutine obtains as result.
The URL Pattern Standard provides a web platform primitive for matching URLs based on a convenient pattern syntax.
Finch is an open source tool for local container development. Finch aims to help promote innovative upstream container projects (including Lima, nerdctl, containerd and BuildKit) by making it easy to install and use them. Finch provides a simple native client to tie it all together.
A classic design of cloud-native databases adopts an architecture that consists of one read/write (RW) node and one or more read-only (RO) nodes. In such a design, the propagation of write-ahead logs (WALs) from the RW node to the RO node(s) is typically performed asynchronously. Consequently, system designers either have to accept a loose consistency guarantee, where a read from the RO node may return stale data, or tolerate significant performance degradation in terms of read latency, as it then needs to wait for the log to be propagated and applied. Most commercial cloud-native databases, such as Amazon Aurora, choose performance over strong consistency. As a result, it makes RO nodes useless for many applications requiring read-after-write consistency (a form of strong consistency), and the support for serverless databases (i.e., allowing the RO nodes to be scaled out automatically) is impossible as they require a single endpoint. This paper proposes PolarDB-SCC (PolarDB-Strongly Consistent Cluster), a cloud-native database architecture that guarantees strongly consistent reads with very low latency. The core idea is to eliminate unnecessary waits and reduce the necessary wait time on RO nodes while still supporting strong consistency. To achieve this, it tracks the RW node’s modification timestamp at three progressively finer-grained levels.
Git can “trace the evolution” of a specific method when passed -L using the :<funcname> variant. Let’s take a look at the history of this blog’s posts’ published scope, which is used in most places that articles are listed out. You can see that git is able to identify the boundaries of the method even as the length changes and it prints out the full method with diff from each commit in the log, which is filtered to changes to that method.
Multi-stage serverless applications, i.e., workflows with many computation and I/O stages, are becoming increasingly representative of FaaS platforms. Despite their advantages in terms of fine-grained scalability and modular development, these applications are subject to suboptimal performance, resource inefficiency, and high costs to a larger degree than previous simple serverless functions. We present Aquatope, a QoS-and-uncertainty-aware resource scheduler for end-to-end serverless workflows that takes into account the inherent uncertainty present in FaaS platforms, and improves performance predictability and resource efficiency. Aquatope uses a set of scalable and validated Bayesian models to create pre-warmed containers ahead of function invocations, and to allocate appropriate resources at function granularity to meet a complex workflow's end-to-end QoS, while minimizing resource cost. Across a diverse set of analytics and interactive multi-stage serverless workloads, Aquatope significantly outperforms prior systems, reducing QoS violations by 5x, and cost by 34% on average and up to 52% compared to other QoS-meeting methods.