Scintilla06/DailyArXiv

 
 

Daily Papers

The project automatically fetches the latest papers from arXiv based on keywords.

The subheadings in the README file represent the search keywords.

Only the most recent articles for each keyword are retained, up to a maximum of 100 papers.

You can click the 'Watch' button to receive daily email notifications.

Last update: 2026-03-26

Knapsack

Title Date Abstract Comment
Deterministic Algorithm for Non-monotone Submodular Maximization under Matroid and Knapsack Constraints 2026-03-16

Submodular maximization constitutes a prominent research topic in combinatorial optimization and theoretical computer science, with extensive applications across diverse domains. While substantial advancements have been achieved in approximation algorithms for submodular maximization, the majority of algorithms yielding high approximation guarantees are randomized. In this work, we investigate deterministic approximation algorithms for maximizing non-monotone submodular functions subject to matroid and knapsack constraints. For the two distinct constraint settings, we propose novel deterministic algorithms grounded in an extended multilinear extension framework. Under matroid constraints, our algorithm achieves an approximation ratio of $(0.385 - ε)$, whereas for knapsack constraints, the proposed algorithm attains an approximation ratio of $(0.367 - ε)$. Both algorithms run in $\mathrm{poly}(n)$ query complexity, where $n$ is the size of the ground set, and improve upon the state-of-the-art deterministic approximation ratios of $(0.367 - ε)$ for matroid constraints and $0.25$ for knapsack constraints.

Multi-Objective Evolutionary Optimization of Chance-Constrained Multiple-Choice Knapsack Problems with Implicit Probability Distributions 2026-03-09

The multiple-choice knapsack problem (MCKP) is a classic combinatorial optimization problem with wide practical applications. This paper investigates a significant yet underexplored extension of MCKP: the multi-objective chance-constrained MCKP (MO-CCMCKP) under implicit probability distributions. The goal of the problem is to simultaneously minimize the total cost and maximize the confidence level of satisfying the capacity constraint, capturing essential trade-offs in domains like 5G network configuration. To address the computational challenge of evaluating chance constraints under implicit distributions, we first propose an order-preserving efficient resource allocation Monte Carlo (OPERA-MC) method. This approach adaptively allocates sampling resources to preserve dominance relationships while reducing evaluation time significantly. Further, we develop NHILS, a hybrid evolutionary algorithm that integrates specialized initialization and local search into NSGA-II to navigate sparse feasible regions. Experiments on synthetic benchmarks and real-world 5G network configuration benchmarks demonstrate that NHILS consistently outperforms several state-of-the-art multi-objective optimizers in convergence, diversity, and feasibility. The benchmark instances and source code will be made publicly available to facilitate research in this area.
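As a point of reference for the chance-constraint evaluation discussed above, here is a minimal fixed-budget Monte Carlo feasibility estimator for a multiple-choice knapsack solution. This is not the paper's OPERA-MC method (which adaptively allocates samples across solutions); the function name and the `sample_weight` interface are hypothetical.

```python
import random

def estimate_feasibility(solution, sample_weight, capacity, n_samples=10_000, seed=0):
    """Monte Carlo estimate of P(total weight <= capacity) for a
    multiple-choice knapsack solution under implicit weight distributions.

    solution: chosen item index per class, e.g. [2, 0, 1]
    sample_weight(cls, item, rng): draws one random weight for that item;
        the distribution is accessed only through sampling (it is implicit).
    """
    rng = random.Random(seed)  # fixed seed for reproducible estimates
    hits = 0
    for _ in range(n_samples):
        total = sum(sample_weight(cls, item, rng)
                    for cls, item in enumerate(solution))
        if total <= capacity:
            hits += 1
    return hits / n_samples
```

An adaptive scheme such as OPERA-MC would stop sampling a solution early once its dominance relation to competing solutions is settled; the fixed-budget version above is the baseline such schemes improve on.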

Stochastic Knapsack: Semi-Adaptivity Gaps and Improved Approximation 2026-03-02

In stochastic combinatorial optimization, algorithms differ in their adaptivity: whether or not they query realized randomness and adapt to it. Dean et al. (FOCS '04) formalize the adaptivity gap, which compares the performance of fully adaptive policies to that of non-adaptive ones. We revisit the fundamental Stochastic Knapsack problem of Dean et al., where items have deterministic values and independent stochastic sizes. A policy packs items sequentially, stopping at the first knapsack overflow or before. We focus on the challenging risky variant, in which an overflow forfeits all accumulated value, and study the problem through the lens of semi-adaptivity: We measure the power of $k$ adaptive queries for constant $k$ through the notions of $0$-$k$ semi-adaptivity gap (the gap between $k$-semi-adaptive and non-adaptive policies), and $k$-$n$ semi-adaptivity gap (between fully adaptive and $k$-semi-adaptive policies). Our first contribution is to improve the classic results of Dean et al. by giving tighter upper and lower bounds on the adaptivity gap. Our second contribution is a smoother interpolation between non-adaptive and fully-adaptive policies, with the rationale that when full adaptivity is unrealistic (due to its complexity or query cost), limited adaptivity may be a desirable middle ground. We quantify the $1$-$n$ and $k$-$n$ semi-adaptivity gaps, showing how well $k$ queries approximate the fully-adaptive policy. We complement these bounds by quantifying the $0$-$1$ semi-adaptivity gap, i.e., the improvement from investing in a single query over no adaptivity. As part of our analysis, we develop a 3-step "Simplify-Equalize-Optimize" approach to analyzing adaptive decision trees, with possible applications to the study of semi-adaptivity in additional stochastic combinatorial optimization problems.

KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem 2026-02-23

Self-speculative decoding (SSD) accelerates LLM inference by skipping layers to create an efficient draft model, yet existing methods often rely on static heuristics that ignore the dynamic computational overhead of attention in long-context scenarios. We propose KnapSpec, a training-free framework that reformulates draft model selection as a knapsack problem to maximize tokens-per-time throughput. By decoupling Attention and MLP layers and modeling their hardware-specific latencies as functions of context length, KnapSpec adaptively identifies optimal draft configurations on the fly via a parallel dynamic programming algorithm. Furthermore, we provide the first rigorous theoretical analysis establishing cosine similarity between hidden states as a mathematically sound proxy for the token acceptance rate. This foundation allows our method to maintain high drafting faithfulness while navigating the shifting bottlenecks of real-world hardware. Our experiments on Qwen3 and Llama3 demonstrate that KnapSpec consistently outperforms state-of-the-art SSD baselines, achieving up to 1.47x wall-clock speedup across various benchmarks. Our plug-and-play approach ensures high-speed inference for long sequences without requiring additional training or compromising the target model's output distribution.

Stochastic Knapsack with Costs: On Adaptivity and Return-on-Investment 2026-02-17

We revisit the Stochastic Knapsack problem, where a policy-maker chooses an execution order for jobs with fixed values and stochastic running-times, aiming to maximize the value completed by a deadline. Dean et al. (FOCS'04) show that simple non-adaptive policies can approximate the (highly adaptive) optimum, initiating the study of adaptivity gaps. We introduce an economically motivated generalization in which each job also carries an execution cost. This uncovers new applications, most notably a new and natural variant of contract design: jobs are processed by agents who choose among effort levels that induce different processing-time distributions, and the principal must decide which jobs to run and what payments induce the intended effort. With costs, the objective becomes mixed-sign: value from completed jobs must be balanced against costs of execution, and running a job that misses the deadline can create negative utility. This changes the algorithmic picture, and the adaptivity gap is no longer constant. We give an economic explanation: the performance of non-adaptive policies is governed by jobs' return on investment (ROI) -- utility over cost -- which can be arbitrarily small. We introduce a hierarchy of increasingly adaptive policies, trading off simplicity and adaptivity. We prove near-tight guarantees across the hierarchy, showing that with costs the adaptivity gap is $Θ(α)$, where $1/α$ is the ROI. Higher in the hierarchy, we identify an efficiently computable policy with limited adaptivity that is approximately-optimal. Analogous to the centrality of ROI in economics, we believe our ROI-based, simple-vs-optimal approach to adaptivity may be useful for additional stochastic optimization and online problems with mixed-sign objectives.

ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression 2026-02-11

We present ROCKET, a training-free model compression method that achieves state-of-the-art performance in comparison with factorization, structured-sparsification and dynamic compression baselines. Operating under a global compression budget, ROCKET comprises two key innovations: First, it formulates layer-wise compression allocation as a multi-choice knapsack problem, selecting the optimal compression level for each layer to minimize total reconstruction error while adhering to a target model size. Second, it introduces a single-step sparse matrix factorization inspired by dictionary learning: using only a small calibration set, it sparsifies weight coefficients based on activation-weights sensitivity and then updates the dictionary in closed form via least squares bypassing iterative optimization, sparse coding, or backpropagation entirely. ROCKET consistently outperforms existing compression approaches across different model architectures at 20-50% compression rates. Notably, it retains over 90% of the original model's performance at 30% compression without any fine-tuning. Moreover, when applying a light fine-tuning phase, recovery is substantially enhanced: for instance, compressing Qwen3-14B to an 8B-parameter model and healing it with just 30 million tokens yields performance nearly on par with the original Qwen3-8B. The code for ROCKET is at github.com/mts-ai/ROCKET/tree/main.
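The layer-wise allocation step described above is an instance of the multi-choice knapsack problem: pick exactly one compression level per layer so that total size fits the budget and total reconstruction error is minimized. A minimal exact dynamic program for that subproblem might look as follows (a generic sketch, not ROCKET's solver; the `(size, error)` option format is an assumption):

```python
def allocate_compression(layers, budget):
    """Multi-choice knapsack DP: choose one (size, error) option per layer,
    minimising total reconstruction error under a total size budget.

    layers: list of option lists; each option is an (int size, float error) pair.
    Returns (total_error, chosen option index per layer), or None if infeasible.
    States are keyed by total size used, so there are at most budget+1 of them.
    """
    dp = {0: (0.0, [])}  # size used -> (best error, choices so far)
    for options in layers:
        nxt = {}
        for size_used, (err, picks) in dp.items():
            for idx, (sz, e) in enumerate(options):
                s2 = size_used + sz
                if s2 > budget:
                    continue
                cand = (err + e, picks + [idx])
                if s2 not in nxt or cand[0] < nxt[s2][0]:
                    nxt[s2] = cand
        dp = nxt
    if not dp:
        return None
    return min(dp.values(), key=lambda t: t[0])
```

For example, with two layers offering options `[(4, 0.0), (2, 0.1), (1, 0.5)]` and `[(4, 0.0), (2, 0.2)]` and a budget of 6, the DP selects the second option for the first layer and the uncompressed first option for the second.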

Dominating Set Knapsack: Profit Optimization on Dominating Sets 2026-01-29

In a large-scale network, we want to choose some influential nodes to make a profit by paying some cost within a limited budget, so that we do not have to spend more budget on nodes adjacent to the chosen ones; our problem is the graph-theoretic representation of this setting. We define our problem, Dominating Set Knapsack, by combining the knapsack problem with the dominating set problem on graphs. Each vertex $v \in V$ is associated with a cost factor $w(v)$ and a profit amount $α(v)$. We aim to choose some vertices within a fixed budget $s$ that give maximum profit so that we do not need to choose their 1-hop neighbors. We show that the Dominating Set Knapsack problem is strongly NP-complete even when restricted to bipartite graphs, but weakly NP-complete for star graphs. We present a pseudo-polynomial time algorithm for trees running in time $O(n\cdot \min\{s^2, (α(V))^2\})$. We show that Dominating Set Knapsack is unlikely to be Fixed Parameter Tractable (FPT) by proving that it is W[2]-hard parameterized by the solution size. We develop FPT algorithms with running times $O(4^{tw}\cdot n^{O(1)} \cdot \min\{s^2,(α(V))^2\})$ and $O(2^{vck-1}\cdot n^{O(1)} \cdot \min\{s^2,(α(V))^2\})$, where $tw$ is the treewidth of the given graph $G(V,E)$, $vck$ is the solution size of the Vertex Cover Knapsack, $s$ is the capacity of the knapsack, and $α(V)=\sum_{v\in V}α(v)$. We obtain similar results for the variants $k$-Dominating Set Knapsack and Minimal Dominating Set Knapsack, where $k$ is the size of the dominating set.

Differentiable Knapsack and Top-k Operators via Dynamic Programming 2026-01-29

Knapsack and Top-k operators are useful for selecting discrete subsets of variables. However, their integration into neural networks is challenging as they are piecewise constant, yielding gradients that are zero almost everywhere. In this paper, we propose a unified framework casting these operators as dynamic programs, and derive differentiable relaxations by smoothing the underlying recursions. On the algorithmic side, we develop efficient parallel algorithms supporting both deterministic and stochastic forward passes, and vector-Jacobian products for the backward pass. On the theoretical side, we prove that Shannon entropy is the unique regularization choice yielding permutation-equivariant operators, and characterize regularizers inducing sparse selections. Finally, on the experimental side, we demonstrate our framework on a decision-focused learning benchmark, a constrained dynamic assortment RL problem, and an extension of discrete VAEs.
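The core smoothing idea, replacing the hard max in the knapsack recursion with a temperature-scaled log-sum-exp (the Shannon-entropy regularization the abstract singles out), can be sketched in a few lines. This is a scalar forward-pass illustration only; the paper's framework also covers top-k, stochastic forward passes, and vector-Jacobian products.

```python
import math

def soft_knapsack_value(values, weights, capacity, temperature=0.1):
    """0/1 knapsack DP with max replaced by temperature-scaled log-sum-exp.

    As temperature -> 0 this approaches the exact DP optimum, while for
    temperature > 0 the value is smooth (hence differentiable) in `values`.
    Weights and capacity are assumed to be non-negative integers.
    """
    def softmax2(a, b, t):
        # numerically stable t * log(exp(a/t) + exp(b/t))
        m = max(a, b)
        return m + t * math.log(math.exp((a - m) / t) + math.exp((b - m) / t))

    dp = [0.0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):  # iterate downward: 0/1 semantics
            dp[c] = softmax2(dp[c], dp[c - w] + v, temperature)
    return dp[capacity]
```

Since log-sum-exp upper-bounds the max, the smoothed value slightly exceeds the exact optimum (by at most `temperature * ln 2` per smoothed comparison), which is the usual price of an entropy-regularized relaxation.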

Kad: A Framework for Proxy-based Test-time Alignment with Knapsack Approximation Deferral 2026-01-24

Several previous works have concluded that the largest part of the generation capabilities of large language models (LLMs) is learned (early) during pre-training. However, LLMs still require further alignment to adhere to downstream task requirements and stylistic preferences, among other desired properties. As LLMs continue to scale in size, the computational cost of alignment procedures increases prohibitively. In this work, we propose a novel approach to circumvent these costs via proxy-based test-time alignment, i.e. using guidance from a small aligned model. Our approach can be described as a token-specific cascading method, where the token-specific deferral rule is reduced to a 0-1 knapsack problem. In this setting, we derive primal and dual approximations of the optimal deferral decision. We experimentally show the benefits of our method in both task performance and speculative decoding speed.

EACL 2026 main
A Construction Framework of Coded Caching Scheme for Multi-Access MISO Systems via Knapsack Problem 2026-01-16

This paper investigates the coded caching problem in a multi-access multiple-input single-output (MAMISO) network with the combinatorial topology. The considered system consists of a server containing $N$ files, $Λ$ cache nodes, and $K$ cache-less users, where each user can access a unique subset of $r$ cache nodes. The server is equipped with $L$ transmit antennas. Our objective is to design a caching scheme that simultaneously achieves a high sum Degree of Freedom (sum-DoF) and low subpacketization complexity. To address this challenge, we formulate the design of multi-antenna placement delivery arrays (MAPDA) as a $0$--$1$ knapsack problem to maximize the achievable DoF, thereby transforming the complex combinatorial caching structure into a tractable optimization framework that yields efficient cache placement and flexible delivery strategies. Theoretical and numerical analyses demonstrate that, for networks with combinatorial topologies, the proposed scheme achieves a higher sum-DoF than existing schemes. Under identical cache size constraints, the subpacketization level remains comparable to existing linear subpacketization schemes. Moreover, under specific system conditions, the proposed scheme attains the theoretical maximum sum-DoF of $\min\{L+KM/N, K\}$ while achieving further reductions in subpacketization. For particular combinatorial structures, we further derive optimized constructions that achieve even higher sum-DoF with lower subpacketization.

Shadoks Approach to Knapsack Polygonal Packing 2026-01-15

The 2024 edition of the CG:SHOP Challenge focused on the knapsack polygonal packing problem. Each instance consists of a convex polygon known as the container and a multiset of items, where each item is a simple polygon with an associated integer value. A feasible packing solution places a selection of the items inside the container without overlap, using only translations. The goal is to achieve a packing that maximizes the total value of the items in the solution. Our first-place approach is divided into two main steps. First, we generate promising initial solutions using two strategies: one based on integer linear programming, the other on a combination of geometric greedy heuristics. In the second step, we enhance these solutions through local search techniques, which involve repositioning items and exploring potential replacements to improve the total value of the packing.

Episodic Contextual Bandits with Knapsacks under Conversion Models 2026-01-02

We study an online setting, where a decision maker (DM) interacts with contextual bandit-with-knapsack (BwK) instances in repeated episodes. These episodes start with different resource amounts, and the contexts' probability distributions are non-stationary in an episode. All episodes share the same latent conversion model, which governs the random outcome contingent upon a request's context and an allocation decision. Our model captures applications such as dynamic pricing on perishable resources with episodic replenishment, and first price auctions in repeated episodes with different starting budgets. We design an online algorithm that achieves a regret sub-linear in $T$, the number of episodes, assuming access to a \emph{confidence bound oracle} that achieves an $o(T)$-regret. Such an oracle is readily available from existing contextual bandit literature. We overcome the technical challenge with arbitrarily many possible contexts, which leads to a reinforcement learning problem with an unbounded state space. Our framework provides improved regret bounds in certain settings when the DM is provided with unlabeled feature data, which is novel to the contextual BwK literature.

An O(nlogn) approximate knapsack algorithm 2025-12-27

A modified dynamic programming algorithm rapidly and accurately solves large 0/1 knapsack problems. It has O(n log n) time complexity, O(n log n) space complexity, and a predictable maximum error. Experimentally, its accuracy increases faster than linearly with the solution size k: problems with k=1e3 are solved with an average maximum fractional error of 1e-4, and problems with k=1e5 with an average maximum fractional error of 1e-7. The algorithm runs in constant time for all problems with a given n. On a common desktop computer it processes n=1e3 problems in 1e-3 seconds and n=1e6 problems in 2 seconds.


8 pages: removed latex from metadata title and abstract to improve readability
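For contrast with the O(n log n) approximation above, the textbook exact 0/1 knapsack dynamic program (the baseline such approximations trade accuracy against) runs in O(n·W) time for capacity W:

```python
def knapsack_exact(values, weights, capacity):
    """Standard exact 0/1 knapsack DP over capacities.

    dp[c] holds the best total value achievable with capacity c using the
    items processed so far; iterating c downward enforces 0/1 (no reuse).
    Time and space are O(n * capacity), i.e. pseudo-polynomial.
    """
    dp = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]
```

The pseudo-polynomial dependence on the capacity is exactly what makes an O(n log n) approximate scheme attractive for large instances.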

Near-Optimal Sparsifiers for Stochastic Knapsack and Assignment Problems 2025-12-01

When uncertainty meets costly information gathering, a fundamental question emerges: which data points should we probe to unlock near-optimal solutions? Sparsification of stochastic packing problems addresses this trade-off. The existing notions of sparsification measure the level of sparsity, called degree, as the ratio of queried items to the optimal solution size. While effective for matching and matroid-type problems with uniform structures, this cardinality-based approach fails for knapsack-type constraints where feasible sets exhibit dramatic structural variation. We introduce a polyhedral sparsification framework that measures the degree as the smallest scalar needed to embed the query set within a scaled feasibility polytope, naturally capturing redundancy without relying on cardinality. Our main contribution establishes that knapsack, multiple knapsack, and generalized assignment problems admit (1 - epsilon)-approximate sparsifiers with degree polynomial in 1/p and 1/epsilon -- where p denotes the independent activation probability of each element -- remarkably independent of problem dimensions. The key insight involves grouping items with similar weights and deploying a charging argument: when our query set misses an optimal item, we either substitute it with a queried item from the same group or leverage that group's excess contribution to compensate for the loss. This reveals an intriguing complexity-theoretic separation -- while the multiple knapsack problem lacks an FPTAS and generalized assignment is APX-hard, their sparsification counterparts admit efficient (1 - epsilon)-approximation algorithms that identify polynomial-degree query sets. Finally, we raise an open question: can such sparsification extend to general integer linear programs with degree independent of problem dimensions?


51 pages, 8 figures. Accepted to ITCS 2026

Automated Composition of Agents: A Knapsack Approach for Agentic Component Selection 2025-11-27

Designing effective agentic systems requires the seamless composition and integration of agents, tools, and models within dynamic and uncertain environments. Most existing methods rely on static, semantic retrieval approaches for tool or agent discovery. However, effective reuse and composition of existing components remain challenging due to incomplete capability descriptions and the limitations of retrieval methods. Component selection suffers because the decisions are not based on capability, cost, and real-time utility. To address these challenges, we introduce a structured, automated framework for agentic system composition that is inspired by the knapsack problem. Our framework enables a composer agent to systematically identify, select, and assemble an optimal set of agentic components by jointly considering performance, budget constraints, and compatibility. By dynamically testing candidate components and modeling their utility in real-time, our approach streamlines the assembly of agentic systems and facilitates scalable reuse of resources. Empirical evaluation with Claude 3.5 Sonnet across five benchmarking datasets shows that our online-knapsack-based composer consistently lies on the Pareto frontier, achieving higher success rates at significantly lower component costs compared to our baselines. In the single-agent setup, the online knapsack composer shows a success rate improvement of up to 31.6% in comparison to the retrieval baselines. In multi-agent systems, the online knapsack composer increases success rate from 37% to 87% when agents are selected from an agent inventory of 100+ agents. The substantial performance gap confirms the robust adaptability of our method across diverse domains and budget constraints.


Accepted to NeurIPS 2025 Conference

Evolutionary Algorithm for Chance Constrained Quadratic Multiple Knapsack Problem 2025-11-04

Quadratic multiple knapsack problem (QMKP) is a combinatorial optimisation problem characterised by multiple weight capacity constraints and a profit function that combines linear and quadratic profits. We study a stochastic variant of this problem where profits are considered as random variables. This problem reflects complex resource allocation problems in real-world scenarios where randomness is inherent. We model this problem using chance constraints to capture the stochastic profits. We propose a hybrid approach for this problem, which combines an evolutionary algorithm (EA) with a local optimisation strategy inspired by multi-factorial optimisation (MFO). EAs are used for global search due to their effectiveness in handling large, complex solution spaces. In the hybrid approach, EA periodically passes interim solutions to the local optimiser for refinement. The local optimiser applies MFO principles, which are typically used in multi-tasking problems. The local optimiser models the local problem as a multi-tasking problem by constructing disjoint search spaces for each knapsack based on an input solution. For each item, its assignment across all knapsacks is considered to determine the preferred knapsack. Items are then divided into disjoint groups corresponding to each knapsack, allowing each knapsack to be treated as a separate optimisation task. This structure enables effective application of MFO-based local refinements. We consider two EAs for the problem, (1+1) EA and ($μ+λ$) EA. We conduct experiments to explore the effectiveness of these EAs on their own and also with the proposed local optimiser. Experimental results suggest that hybrid approaches, particularly those incorporating MFO, perform well on instances where chance constraints and capacity constraints are tight.

A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints 2025-11-03

In this paper, we consider a multi-stage dynamic assortment optimization problem with a multinomial logit (MNL) choice model under resource knapsack constraints. Given the current resource inventory levels, the retailer makes an assortment decision at each period, and the goal of the retailer is to maximize the total profit from purchases. With the exact optimal dynamic assortment solution being computationally intractable, a practical strategy is to adopt the re-solving technique that periodically re-optimizes deterministic linear programs (LP) arising from fluid approximation. However, the fractional structure of MNL makes the fluid approximation in assortment optimization non-linear, which brings new technical challenges. To address this challenge, we propose a new epoch-based re-solving algorithm that effectively transforms the denominator of the objective into the constraint, so that the re-solving technique is applied to a linear program with additional slack variables amenable to practical computations and theoretical analysis. Theoretically, we prove that the regret (i.e., the gap between the re-solving policy and the optimal objective of the fluid approximation) scales logarithmically with the length of the time horizon and the resource capacities.

Message Recovery Attack in NTRU via Knapsack 2025-10-29

In the present paper, we introduce a message-recovery attack based on the Modular Knapsack Problem, applicable to all variants of the NTRU-HPS cryptosystem. Assuming that a fraction $ε$ of the coefficients of the message ${\bf m}\in\{-1,0,1\}^N$ and of the nonce vector ${\bf r}\in\{-1,0,1\}^N$ are known in advance at random positions, we reduce message decryption to finding a short vector in a lattice that encodes an instance of a modular knapsack system. This allows us to address a key question: how much information about ${\bf m}$, or about the pair $({\bf m},{\bf r})$, is required before recovery becomes feasible? A FLATTER reduction successfully recovers the message in practice when $ε\approx 0.45$. Our implementation finds ${\bf m}$ within a few minutes on a commodity desktop.

Robust Extensible Bin Packing and Revisiting the Convex Knapsack Problem 2025-10-27

We study a robust extensible bin packing problem under a budgeted uncertainty model, where item sizes lie in the intersection of a box with a one-norm ball. We propose a scenario generation algorithm for this problem, which alternates between solving a master robust bin-packing problem with a finite uncertainty set and solving a separation problem. We first show that the separation is strongly NP-hard given solutions to the continuous relaxation of the master problem. Then, focusing on the separation problem for the integer master problem, we show that it becomes a special case of the continuous convex knapsack problem, which is known to be weakly NP-hard. Next, we prove that our special case remains NP-hard even when each of the functions is piecewise linear with only two pieces. We develop a pseudo-polynomial dynamic program (DP) and a fully polynomial-time approximation scheme (FPTAS) for our special case, with running times matching those of a binary knapsack FPTAS. Our computational study shows that the DP can be significantly more efficient in practice than solving the problem with specially ordered set (SOS) constraints using advanced mixed-integer programming (MIP) solvers. Our experiments also demonstrate the application of our separation method to solving the robust extensible bin packing problem, including deferring the exact solution of the master problem by separating on approximate master solutions in intermediate iterations. Finally, a case study based on real elective surgery data demonstrates the potential advantage of our model compared with the actual schedule and optimal nominal schedules.

Solving 0-1 Integer Programs with Unknown Knapsack Constraints Using Membership Oracles 2025-10-23

We consider solving a combinatorial optimization problem with unknown knapsack constraints using a membership oracle for each unknown constraint such that, given a solution, the oracle determines whether the constraint is satisfied or not with absolute certainty. The goal of the decision maker is to find the best possible solution subject to a budget on the number of oracle calls. Inspired by active learning for binary classification based on Support Vector Machines (SVMs), we devise a framework to solve the problem by learning and exploiting surrogate linear constraints. The framework includes training linear separators on the labeled points and selecting new points to be labeled, which is achieved by applying a sampling strategy and solving a 0-1 integer linear program. Following the active learning literature, a natural choice would be SVM as a linear classifier and the information-based sampling strategy known as simple margin, for each unknown constraint. We improve on both sides: we propose an alternative sampling strategy based on mixed-integer quadratic programming and a linear separation method inspired by an algorithm for convex optimization in the oracle model. We conduct experiments on classical problems and variants inspired by realistic applications to show how different linear separation methods and sampling strategies influence the quality of the results in terms of several metrics including objective value, dual bound and running time.

The Complexity of Recognizing Facets for the Knapsack Polytope 2025-10-20

The complexity class DP is the class of all languages that are the intersection of a language in NP and a language in coNP. It was conjectured that recognizing a facet of the knapsack polytope is DP-complete. We provide a positive answer to this conjecture. Moreover, despite the DP-hardness of the recognition problem, we give a polynomial-time algorithm for deciding whether an inequality with a fixed number of distinct coefficients defines a facet of a knapsack polytope.

An Exact Solver for Submodular Knapsack Problems 2025-10-20

We study the problem of maximizing a monotone increasing submodular function over a set of weighted elements subject to a knapsack constraint. Although this problem is NP-hard, many applications require exact solutions, as approximate solutions are often insufficient in practice. To address this need, we propose an exact branch-and-bound algorithm tailored for the submodular knapsack problem and introduce several acceleration techniques to enhance its efficiency. We evaluate these techniques on artificial instances of three benchmark problems as well as on instances derived from real-world data. We compare the proposed solver with two solvers by Sakaue and Ishihata (2018), which currently achieve the strongest performance reported in the literature, as well as with a branch-and-cut algorithm implemented using Gurobi that solves a binary linear reformulation of the submodular knapsack problem, demonstrating that our methods are highly successful.

Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation 2025-09-30

Large Language Models (LLMs) can self-improve through reinforcement learning, where they generate trajectories to explore and discover better solutions. However, this exploration process is computationally expensive, often forcing current methods to assign limited exploration budgets to each task. This uniform allocation creates problematic edge cases: easy tasks consistently succeed while difficult tasks consistently fail, both producing zero gradients during training updates for the widely used Group Relative Policy Optimization (GRPO). We address this problem from the lens of exploration budget allocation. Viewing each task's exploration as an "item" with a distinct "value" and "cost", we establish a connection to the classical knapsack problem. This formulation allows us to derive an optimal assignment rule that adaptively distributes resources based on the model's current learning status. When applied to GRPO, our method increases the effective ratio of non-zero policy gradients by 20-40% during training. Acting as a computational "free lunch", our approach could reallocate exploration budgets from tasks where learning is saturated to those where it is most impactful. This enables significantly larger budgets (e.g., 93 rollouts) for especially challenging problems, which would be computationally prohibitive under a uniform allocation. These improvements translate to meaningful gains on mathematical reasoning benchmarks, with average improvements of 2-4 points and peak gains of 9 points on specific tasks. Notably, achieving comparable performance with traditional homogeneous allocation would require about 2x the computational resources.
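The budget-allocation view above can be illustrated with a toy greedy scheme: treat each extra rollout as a unit-cost item whose value is the marginal increase in the probability that a GRPO group mixes successes and failures (the condition for a non-zero gradient). This is a simplified sketch under assumed independent Bernoulli rollouts, not the paper's derived optimal assignment rule; the function name and defaults are hypothetical.

```python
def allocate_rollouts(success_probs, total_budget, min_rollouts=2):
    """Greedy marginal-gain allocation of a shared rollout budget.

    With per-rollout success probability p and n rollouts, the group mixes
    successes and failures with probability 1 - p**n - (1-p)**n (assuming
    independent Bernoulli outcomes).  Each extra rollout goes to the task
    where it raises this probability the most, so near-saturated easy or
    hard tasks (p close to 1 or 0) receive fewer rollouts.
    """
    def gain(p, n):
        def mixed(m):
            return 1 - p**m - (1 - p)**m
        return mixed(n + 1) - mixed(n)

    alloc = [min_rollouts] * len(success_probs)
    extra = total_budget - min_rollouts * len(success_probs)
    for _ in range(extra):
        i = max(range(len(success_probs)),
                key=lambda j: gain(success_probs[j], alloc[j]))
        alloc[i] += 1
    return alloc
```

For instance, with tasks at success probabilities 0.5 and 0.99 and a total budget of 8, all extra rollouts flow to the uncertain task, giving the split [6, 2].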

Stealing From the Dragon's Hoard: Online Unbounded Knapsack With Removal 2025-09-26
Show

We introduce the Online Unbounded Knapsack Problem with Removal, a variation of the well-known Online Knapsack Problem. Items, each with a weight and value, arrive online and an algorithm must decide on whether or not to pack them into a knapsack with a fixed weight limit. An item may be packed an arbitrary number of times and items may be removed from the knapsack at any time without cost. The goal is to maximize the total value of items packed, while respecting a weight limit. We show that this is one of the very few natural online knapsack variants that allow for competitive deterministic algorithms in the general setting, by providing an algorithm with competitivity 1.6911. We complement this with a lower bound of 1.5877. We also analyze the proportional setting, where the weight and value of any single item agree, and show that deterministic algorithms can be exactly 3/2-competitive. Lastly, we give lower and upper bounds of 6/5 and 4/3 on the competitivity of randomized algorithms in this setting.

Blind Network Revenue Management and Bandits with Knapsacks under Limited Switches 2025-09-17
Show

This paper studies the impact of limited switches on resource-constrained dynamic pricing with demand learning. We focus on the classical price-based blind network revenue management problem and extend our results to the bandits with knapsacks problem. In both settings, a decision maker faces stochastic and distributionally unknown demand, and must allocate finite initial inventory across multiple resources over time. In addition to standard resource constraints, we impose a switching constraint that limits the number of action changes over the time horizon. We establish matching upper and lower bounds on the optimal regret and develop computationally efficient limited-switch algorithms that achieve it. We show that the optimal regret rate is fully characterized by a piecewise-constant function of the switching budget, which further depends on the number of resource constraints. Our results highlight the fundamental role of resource constraints in shaping the statistical complexity of online learning under limited switches. Extensive simulations demonstrate that our algorithms maintain strong cumulative reward performance while significantly reducing the number of switches.

$(1-ε)$-Approximation of Knapsack in Nearly Quadratic Time 2025-08-10
Show

Knapsack is one of the most fundamental problems in theoretical computer science. In the $(1 - ε)$-approximation setting, although there is a fine-grained lower bound of $(n + 1 / ε) ^ {2 - o(1)}$ based on the $(\min, +)$-convolution hypothesis ([Künnemann, Paturi and Schneider, ICALP 2017] and [Cygan, Mucha, Węgrzycki and Włodarczyk, 2017]), the best algorithm is randomized and runs in $\tilde O\left(n + (\frac{1}{ε})^{11/5}/2^{Ω(\sqrt{\log(1/ε)})}\right)$ time [Deng, Jin and Mao, SODA 2023], and it remains an important open problem whether an algorithm with a running time that matches the lower bound (up to a sub-polynomial factor) exists. We answer the question positively by showing a deterministic $(1 - ε)$-approximation scheme for knapsack that runs in $\tilde O(n + (1 / ε) ^ {2})$ time. We first extend a known lemma in a recursive way to reduce the problem to $n ε$-additive approximation for $n$ items with profits in $[1, 2)$. Then we give a simple efficient geometry-based algorithm for the reduced problem.
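For context, the baseline such results improve on is the textbook profit-scaling FPTAS: round profits down to multiples of $K = εp_{\max}/n$, then run an exact DP over scaled profits. The sketch below is this classical $O(n^2/ε)$ scheme, not the paper's near-quadratic algorithm.

```python
def knapsack_fptas(items, capacity, eps):
    """Classical (1 - eps)-approximation for 0-1 knapsack by profit
    scaling. items: list of (profit, weight). Returns a value that is
    at least (1 - eps) * OPT."""
    n = len(items)
    pmax = max(p for p, w in items)
    K = eps * pmax / n                       # scaling factor
    scaled = [(int(p // K), w) for p, w in items]
    P = sum(s for s, _ in scaled)
    INF = float("inf")
    # min_weight[q] = least weight achieving scaled profit exactly q
    min_weight = [0.0] + [INF] * P
    for s, w in scaled:
        for q in range(P, s - 1, -1):        # 0-1 DP, reverse order
            if min_weight[q - s] + w < min_weight[q]:
                min_weight[q] = min_weight[q - s] + w
    best_q = max(q for q in range(P + 1) if min_weight[q] <= capacity)
    return best_q * K
```

Rounding loses at most $K$ profit per item, i.e. at most $nK = εp_{\max} \le ε\,\mathrm{OPT}$ in total, which gives the $(1-ε)$ guarantee.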

Accepted to STOC 2024; Revision note: expanded technical overview;

Convolution and Knapsack in Higher Dimensions 2025-08-10
Show

In the Knapsack problem, one is given the task of packing a knapsack of a given size with items in order to gain a packing with a high profit value. An important connection to the $(\max,+)$-convolution problem has been established, where knapsack solutions can be combined by building the convolution of two sequences. This observation has been used in recent years to give conditional lower bounds but also parameterized algorithms. In this paper we carry these results into higher dimensions. We consider Knapsack where items are characterized by multiple properties -- given through a vector -- and a knapsack that has a capacity vector. The packing must not exceed any of the given capacity constraints. In order to show a similar sub-quadratic lower bound we consider a multidimensional version of $(\max, +)$-convolution. We then consider variants of this problem introduced by Cygan et al. and prove that they are all equivalent in terms of algorithms that allow for a running time sub-quadratic in the number of entries of the array. We develop a parameterized algorithm to solve higher dimensional Knapsack. The techniques we apply are inspired by an algorithm introduced by Axiotis and Tzamos. We will show that even for higher dimensional Knapsack, we can reduce the problem to convolution on one-dimensional, concave sequences, leading to an $\mathcal{O}(dn + dD \cdot \max\{\prod_{i=1}^d t_i,\ t_{\max}\log t_{\max}\})$ algorithm, where $D$ is the number of different weight vectors, $t$ the capacity vector and $d$ is the dimension of the problem. Then, we use the techniques to improve the approach of Eisenbrand and Weismantel to obtain an algorithm for Integer Linear Programming with upper bounds with running time $\mathcal{O}(dn) + D \cdot \mathcal{O}(d Δ)^{d(d+1)} + T_{\mathrm{LP}}$.

accepted at WADS 2025

Performance of the Extended Ising Machine for the Quadratic Knapsack Problem 2025-08-09
Show

The extended Ising machine (EIM) enhances conventional Ising models, which handle only binary quadratic forms, by allowing constraints through real-valued dependent variables. We address the quadratic knapsack problem (QKP), which is hard to solve using Ising machines when formulated as a quadratic unconstrained binary optimization (QUBO). We demonstrate the EIM's superiority by comparing it with the conventional Ising model-based approach, a commercial exact solver, and a state-of-the-art heuristic solver for QKP.

6 pages
An Online Multi-dimensional Knapsack Approach for Slice Admission Control 2025-08-08
Show

Network Slicing has emerged as a powerful technique to enable cost-effective, multi-tenant communications and services over a shared physical mobile network infrastructure. One major challenge of service provisioning in slice-enabled networks is the uncertainty in the demand for the limited network resources that must be shared among existing slices and potentially new Network Slice Requests. In this paper, we consider admission control of Network Slice Requests in an online setting, with the goal of maximizing the long-term revenue received from admitted requests. We model the Slice Admission Control problem as an Online Multidimensional Knapsack Problem and present two reservation-based policies and their algorithms, which have a competitive performance for Online Multidimensional Knapsack Problems. Through Monte Carlo simulations, we evaluate the performance of our online admission control method in terms of average revenue gained by the Infrastructure Provider, system resource utilization, and the ratio of accepted slice requests. We compare our approach with that of the online First Come First Serve greedy policy. The simulation results show that our proposed online policies increase revenues for Infrastructure Providers by up to 12.9% while reducing the average resource consumption by up to 1.7%. In particular, when the tenants' economic inequality increases, an Infrastructure Provider who adopts our proposed online admission policies gains higher revenues compared to one who adopts First Come First Serve.

Accepted by 20th Consumer Communications & Networking Conference (CCNC)

High-dimensional Linear Bandits with Knapsacks 2025-08-02
Show

We investigate the contextual bandits with knapsack (CBwK) problem in a high-dimensional linear setting, where the feature dimension can be very large. Our goal is to harness sparsity to obtain sharper regret guarantees. To this end, we first develop an online variant of the hard thresholding algorithm that performs the sparse estimation in an online manner. We then embed this estimator in a primal-dual scheme: every knapsack constraint is paired with a dual variable, which is updated by an online learning rule to keep the cumulative resource consumption within budget. This integrated approach achieves a two-phase sub-linear regret that scales only logarithmically with the feature dimension, improving on the polynomial dependency reported in prior work. Furthermore, we show that either of the following structural assumptions is sufficient for a sharper regret bound of $\tilde{O}(s_{0} \sqrt{T})$: (i) a diverse-covariate condition; and (ii) a margin condition. When both conditions hold simultaneously, we can further control the regret to $O(s_{0}^{2} \log(dT)\log T)$ by a dual resolving scheme. As a by-product, applying our framework to high-dimensional contextual bandits without knapsack constraints recovers the optimal regret rates in both the data-poor and data-rich regimes. Finally, numerical experiments confirm the empirical efficiency of our algorithms in high-dimensional settings.

Efficient Branch-and-Bound for Submodular Function Maximization under Knapsack Constraint 2025-07-15
Show

The submodular knapsack problem (SKP), which seeks to maximize a submodular set function by selecting a subset of elements within a given budget, is an important discrete optimization problem. The majority of existing approaches to solving the SKP are approximation algorithms. However, in domains such as health-care facility location and risk management, the need for optimal solutions is still critical, necessitating the use of exact algorithms over approximation methods. In this paper, we present an optimal branch-and-bound approach, featuring a novel upper bound with a worst-case tightness guarantee and an efficient dual branching method to minimize repeat computations. Experiments in applications such as facility location, weighted coverage, and influence maximization show that algorithms implementing the new ideas are far more efficient than conventional methods.

Accepted to ECAI 2025

Near-Optimal Consistency-Robustness Trade-Offs for Learning-Augmented Online Knapsack Problems 2025-07-09
Show

This paper introduces a family of learning-augmented algorithms for online knapsack problems that achieve near Pareto-optimal consistency-robustness trade-offs through a simple combination of trusted learning-augmented and worst-case algorithms. Our approach relies on succinct, practical predictions -- single values or intervals estimating the minimum value of any item in an offline solution. Additionally, we propose a novel fractional-to-integral conversion procedure, offering new insights for online algorithm design.

31 pages, 16 figures, Accepted at ICML 2025

Quantum Algorithms for Bandits with Knapsacks with Improved Regret and Time Complexities 2025-07-06
Show

Bandits with knapsacks (BwK) constitute a fundamental model that combines aspects of stochastic integer programming with online learning. Classical algorithms for BwK with a time horizon $T$ achieve a problem-independent regret bound of ${O}(\sqrt{T})$ and a problem-dependent bound of ${O}(\log T)$. In this paper, we initiate the study of the BwK model in the setting of quantum computing, where both reward and resource consumption can be accessed via quantum oracles. We establish both problem-independent and problem-dependent regret bounds for quantum BwK algorithms. For the problem-independent case, we demonstrate that a quantum approach can improve the classical regret bound by a factor of $(1+\sqrt{B/\mathrm{OPT}_{\mathrm{LP}}})$, where $B$ is the budget constraint in BwK and $\mathrm{OPT}_{\mathrm{LP}}$ denotes the optimal value of a linear programming relaxation of the BwK problem. For the problem-dependent setting, we develop a quantum algorithm using an inexact quantum linear programming solver. This algorithm achieves a quadratic improvement in terms of the problem-dependent parameters, as well as a polynomial speedup in time complexity with respect to the problem's dimensions compared to classical counterparts. Compared to previous works on quantum algorithms for multi-armed bandits, our study is the first to consider bandit models with resource constraints and hence sheds light on operations research.

33 pages
On the Complexity of Knapsack under Explorable Uncertainty: Hardness and Algorithms 2025-07-03
Show

In the knapsack problem under explorable uncertainty, we are given a knapsack instance with uncertain item profits. Instead of having access to the precise profits, we are only given uncertainty intervals that are guaranteed to contain the corresponding profits. The actual item profit can be obtained via a query. The goal of the problem is to adaptively query item profits until the revealed information suffices to compute an optimal (or approximate) solution to the underlying knapsack instance. Since queries are costly, the objective is to minimize the number of queries. In the offline variant of this problem, we assume knowledge of the precise profits and the task is to compute a query set of minimum cardinality that a third party without access to the profits could use to identify an optimal (or approximate) knapsack solution. We show that this offline variant is complete for the second-level of the polynomial hierarchy, i.e., $Σ_2^p$-complete, and cannot be approximated within a non-trivial factor unless $Σ_2^p = Δ_2^p$. Motivated by these strong hardness results, we consider a resource-augmented variant of the problem where the requirements on the query set computed by an algorithm are less strict than the requirements on the optimal solution we compare against. More precisely, a query set computed by the algorithm must reveal sufficient information to identify an approximate knapsack solution, while the optimal query set we compare against has to reveal sufficient information to identify an optimal solution. We show that this resource-augmented setting allows interesting non-trivial algorithmic results.

Unbounded knapsack problem and double partitions 2025-06-30
Show

The unbounded knapsack problem can be considered as a particular case of the double partition problem that asks for the number of nonnegative integer solutions to a system of two linear Diophantine equations with integer coefficients. In the middle of the 19th century, Sylvester and Cayley suggested an approach based on variable elimination allowing the reduction of a double partition to a sum of scalar partitions. This manuscript discusses a geometric interpretation of this method and its application to the knapsack problem.

6 pages, 1 figure
Knapsack Optimization-based Schema Linking for LLM-based Text-to-SQL Generation 2025-06-20
Show

Generating SQL from user queries is a long-standing challenge, where the accuracy of initial schema linking significantly impacts subsequent SQL generation performance. However, current schema linking models still struggle with missing relevant schema elements or an excess of redundant ones. A crucial reason for this is that the commonly used metrics, recall and precision, fail to capture the omission of relevant elements and thus cannot reflect actual schema linking performance. Motivated by this, we propose enhanced schema linking metrics by introducing a restricted missing indicator. Accordingly, we introduce the Knapsack optimization-based Schema Linking Approach (KaSLA), a plug-in schema linking method designed to prevent the omission of relevant schema elements while minimizing the inclusion of redundant ones. KaSLA employs a hierarchical linking strategy that first identifies the optimal table linking and subsequently links columns within the selected tables to reduce the linking candidate space. In each linking process, it utilizes a knapsack optimization approach to link potentially relevant elements while accounting for a limited tolerance of potentially redundant ones. With this optimization, KaSLA-1.6B achieves superior schema linking results compared to large-scale LLMs, including deepseek-v3 with the state-of-the-art (SOTA) schema linking method. Extensive experiments on the Spider and BIRD benchmarks verify that KaSLA can significantly improve the SQL generation performance of SOTA Text2SQL models by substituting their schema linking processes.
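The knapsack-style selection step can be sketched as a 0-1 DP over candidate schema elements: maximize total relevance score under a size budget, then backtrack to recover which elements to keep. The scores and costs below are hypothetical; KaSLA's scoring model and redundancy-tolerance terms are not shown.

```python
def link_schema_elements(scores, costs, budget):
    """0-1 knapsack selection sketch: pick schema elements maximizing
    total relevance `scores[i]` subject to sum of `costs[i]` <= budget.
    Returns the sorted indices of the chosen elements."""
    dp = [0.0] * (budget + 1)
    # keep[i][b]: taking element i improved the optimum at capacity b
    keep = [[False] * (budget + 1) for _ in scores]
    for i, (s, c) in enumerate(zip(scores, costs)):
        for b in range(budget, c - 1, -1):   # reverse order: 0-1 semantics
            if dp[b - c] + s > dp[b]:
                dp[b] = dp[b - c] + s
                keep[i][b] = True
    # backtrack to recover the chosen element indices
    chosen, b = [], budget
    for i in range(len(scores) - 1, -1, -1):
        if keep[i][b]:
            chosen.append(i)
            b -= costs[i]
    return sorted(chosen)
```

With scores `[3.0, 2.0, 4.0]`, costs `[2, 1, 3]`, and budget `4`, the DP keeps elements 1 and 2 (total score 6.0) rather than the single highest-scoring pair that would overflow the budget.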

Energy Efficient Knapsack Optimization Using Probabilistic Memristor Crossbars 2025-06-17
Show

Constrained optimization underlies crucial societal problems (for instance, stock trading and bandwidth allocation), but is often computationally hard (complexity grows exponentially with problem size). The big-data era urgently demands low-latency and low-energy optimization at the edge, which cannot be handled by digital processors due to their non-parallel von Neumann architecture. Recent efforts using massively parallel hardware (such as memristor crossbars and quantum processors) employing annealing algorithms, while promising, have handled relatively easy and stable problems with sparse or binary representations (such as the max-cut or traveling salesman problems). However, most real-world applications embody three features, which are encoded in the knapsack problem and cannot be handled by annealing algorithms: dense and non-binary representations, with destabilizing self-feedback. Here we demonstrate a post-digital-hardware-friendly randomized competitive Ising-inspired (RaCI) algorithm performing knapsack optimization, experimentally implemented on a foundry-manufactured CMOS-integrated probabilistic analog memristor crossbar. Our solution outperforms digital and quantum approaches by over 4 orders of magnitude in energy efficiency.

13 pages, 6 figures
Knapsack and Shortest Path Problems Generalizations From A Quantum-Inspired Tensor Network Perspective 2025-06-13
Show

In this paper, we present two quantum-inspired tensor network algorithms to solve the knapsack and shortest path problems, which also enable solving some of their variations. These methods provide an exact equation that returns the optimal solution of the problems. As in other tensor network algorithms for combinatorial optimization problems, the method is based on imaginary time evolution and the implementation of restrictions in the tensor network. In addition, we introduce the use of symmetries and the reuse of intermediate calculations, reducing the computational complexity for both problems. To show the efficiency of our implementations, we carry out performance experiments and compare the results with those obtained by other classical algorithms.

14 pages, 14 figures, extended version of the presented and published at the 1st International Conference on Quantum Software (IQSOFT)

An extension of Dembo-Hammer's reduction algorithm for the 0-1 knapsack problem 2025-06-06
Show

Dembo-Hammer's Reduction Algorithm (DHR) is one of the classical algorithms for the 0-1 Knapsack Problem (0-1 KP) and its variants, which reduces an instance of the 0-1 KP to a sub-instance of smaller size with reduction time complexity $O(n)$. We present an extension of DHR (abbreviated as EDHR), which reduces an instance of 0-1 KP to at most $n^i$ sub-instances for any positive integer $i$. In practice, $i$ can be set as needed. In particular, if we choose $i=1$ then EDHR is exactly DHR. Finally, computational experiments on randomly generated data instances demonstrate that EDHR substantially reduces the search tree size compared to CPLEX.
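The core DHR idea can be sketched as variable fixing against a Dantzig-style LP bound: if forcing item $j$ to the "wrong" value drops the upper bound below a known lower bound, $x_j$ can be fixed. The code below is a simplified illustration of this fixing step, not the paper's extended procedure that branches into $n^i$ sub-instances.

```python
def lp_bound(items, capacity, forced=None):
    """Dantzig (fractional greedy) upper bound for 0-1 knapsack with
    some variables forced to 0/1. items: list of (profit, weight).
    Returns -inf if the forced-in items alone overflow the knapsack."""
    forced = forced or {}
    cap, val = capacity, 0.0
    for j, (p, w) in enumerate(items):
        if forced.get(j) == 1:
            cap, val = cap - w, val + p
    if cap < 0:
        return float("-inf")
    # greedily fill with the free items, in profit-density order
    free = sorted((j for j in range(len(items)) if j not in forced),
                  key=lambda j: items[j][0] / items[j][1], reverse=True)
    for j in free:
        p, w = items[j]
        if w <= cap:
            cap, val = cap - w, val + p
        else:
            return val + p * cap / w   # last item taken fractionally
    return val

def dembo_hammer_reduce(items, capacity, lower_bound):
    """Fix x_j whenever forcing the opposite value pushes the LP bound
    below a known lower bound (a sketch of the DHR fixing rule)."""
    fixed = {}
    for j in range(len(items)):
        if lp_bound(items, capacity, {j: 1}) < lower_bound:
            fixed[j] = 0   # no improving solution contains item j
        elif lp_bound(items, capacity, {j: 0}) < lower_bound:
            fixed[j] = 1   # every improving solution contains item j
    return fixed
```

Every item fixed this way shrinks the sub-instance an exact solver must still handle, which is the point of the reduction.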

Fair Submodular Maximization over a Knapsack Constraint 2025-05-17
Show

We consider fairness in submodular maximization subject to a knapsack constraint, a fundamental problem with various applications in economics, machine learning, and data mining. In the model, we are given a set of ground elements, each associated with a weight and a color, and a monotone submodular function defined over them. The goal is to maximize the submodular function while guaranteeing that the total weight does not exceed a specified budget (the knapsack constraint) and that the number of elements selected for each color falls within a designated range (the fairness constraint). While there exists some recent literature on this topic, the existence of a non-trivial approximation for the problem -- without relaxing either the knapsack or fairness constraints -- remains a challenging open question. This paper makes progress in this direction. We demonstrate that when the number of colors is constant, there exists a polynomial-time algorithm that achieves a constant approximation with high probability. Additionally, we show that if either the knapsack or fairness constraint is relaxed only to require expected satisfaction, a tight approximation ratio of $(1-1/e-ε)$ can be obtained in expectation for any $ε>0$.

To appear in IJCAI 2025

Online Knapsack Problems with Estimates 2025-04-30
Show

Imagine you are a computer scientist who enjoys attending conferences or workshops within the year. Sadly, your travel budget is limited, so you must select a subset of events you can travel to. When you are aware of all possible events and their costs at the beginning of the year, you can select the subset of the possible events that maximizes your happiness and is within your budget. On the other hand, if you know nothing about your options, you will likely have a hard time deciding whether to register somewhere or not, and will likely come to regret some of your decisions. These scenarios can be modeled by knapsack variants, either as an offline or an online problem. However, both scenarios are somewhat unrealistic: usually, you will not know the exact costs of each workshop at the beginning of the year. The online version, however, is too pessimistic, as you might already know which options there are and roughly how much they cost. At some point, you have to decide whether to register for some workshop, and only then do you learn the conference fee and the flight and hotel prices. We model this problem within the setting of online knapsack problems with estimates: in the beginning, you receive a list of potential items with their estimated size as well as the accuracy of the estimates. Then, the items are revealed one by one in an online fashion with their actual size, and you need to decide whether to take one or not. In this article, we show a best-possible algorithm for each estimate accuracy $δ$ (i.e., when each actual item size can deviate by $\pm δ$ from the announced size) for both the simple knapsack and the simple knapsack with removability.

20 pages, 2 figures
Online General Knapsack with Reservation Costs 2025-04-29
Show

In the online general knapsack problem, an algorithm is presented with an item $x=(s,v)$ of size $s$ and value $v$ and must irrevocably choose to pack such an item into the knapsack or reject it before the next item appears. The goal is to maximize the total value of the packed items without overflowing the knapsack's capacity. As this classical setting is too harsh for many real-life applications, we analyze the online general knapsack problem under the reservation model. Here, instead of accepting or rejecting an item immediately, an algorithm can delay the decision of whether to pack the item by paying a fraction $α\ge 0$ of the size or the value of the item. This models many practical applications, where decisions can be delayed at some cost, e.g., cancellation fees. We present results for both variants: first for costs depending on the size of the items, and then for costs depending on the value of the items. If the reservation costs depend on the size of the items, we find matching upper and lower bounds of $2$ for every $α$. On the other hand, if the reservation costs depend on the value of the items, we find that no algorithm is competitive for reservation costs larger than $1/2$ of the item value, and we find upper and lower bounds for the rest of the reservation range $0\le α< 1/2$.

14 pages
Knapsack on Graphs with Relaxed Neighborhood Constraints 2025-04-24
Show

In the knapsack problems with neighborhood constraints that were studied before, the input is a graph $\mathcal{G}$ on a set $\mathcal{V}$ of items, where each item $v \in \mathcal{V}$ has a weight $w_v$ and a profit $p_v$, together with the size $s$ of the knapsack and the demand $d$. The goal is to decide if there exists a feasible solution whose total weight is at most $s$ and total profit is at least $d$. Here, feasible solutions are all subsets $\mathcal{S}$ of the items such that, for every item in $\mathcal{S}$, at least one of its neighbors in $\mathcal{G}$ is also in $\mathcal{S}$ (the 1-neighbor constraint), respectively all its neighbors in $\mathcal{G}$ are also in $\mathcal{S}$ (the all-neighbor constraint) [Borradaile et al., 2012]. We study a relaxation of the above problems. Specifically, we allow all possible subsets of items to be feasible solutions. However, only those items for which we pick at least one (respectively all) of their neighbors (out-neighbors for directed graphs) contribute to the profit, whereas every item picked contributes to the weight; we call the corresponding problems the relaxed 1-neighbor and relaxed all-neighbor knapsack problems. We show that both relaxed problems are strongly NP-complete even on undirected graphs. Regarding parameterized complexity, we show that both the relaxed and the original 1-neighbor problems are W[2]-hard parameterized by the size $s$ of the knapsack. Interestingly, both the relaxed and the original all-neighbor problems are W[1]-hard parameterized by the knapsack size $s$ plus the profit demand $d$, and also parameterized by the solution size $b$. For the 1-neighbor variants, we present a randomized color-coding-based pseudo-FPT algorithm, parameterized by the solution size $b$, and consequently by the demand $d$. We then consider the treewidth of the input graph as our parameter and design pseudo-fixed-parameter tractable (FPT) algorithms parameterized by the treewidth $\text{tw}$ for all variants. Finally, we present an additive $1$ approximation for the relaxed 1-neighbor problem when both the weight and the profit of every vertex are $1$.

Adversarial Knapsack for Sequential Competitive Resource Allocation 2025-04-23
Show

This work addresses competitive resource allocation in a sequential setting, where two players allocate resources across objects or locations of shared interest. Departing from the simultaneous Colonel Blotto game, our framework introduces a sequential decision-making dynamic, where players act with partial or complete knowledge of previous moves. Unlike traditional approaches that rely on complex mixed strategies, we focus on deterministic pure strategies, streamlining computation while preserving strategic depth. Additionally, we extend the payoff structure to accommodate fractional allocations and payoffs, moving beyond the binary, all-or-nothing paradigm to allow more granular outcomes. We model this problem as an adversarial knapsack game, formulating it as a bilevel optimization problem that integrates the leader's objective with the follower's best-response. This knapsack-based approach is novel in the context of competitive resource allocation, with prior work only partially leveraging it for follower analysis. Our contributions include: (1) proposing an adversarial knapsack formulation for the sequential resource allocation problem, (2) developing efficient heuristics for fractional allocation scenarios, and (3) analyzing the 0-1 knapsack case, providing a computational hardness result alongside a heuristic solution.

8 pages, 7 figures
Weakly Approximating Knapsack in Subquadratic Time 2025-04-22
Show

We consider the classic Knapsack problem. Let $t$ and $\mathrm{OPT}$ be the capacity and the optimal value, respectively. If one seeks a solution with total profit at least $\mathrm{OPT}/(1 + \varepsilon)$ and total weight at most $t$, then Knapsack can be solved in $\tilde{O}(n + (\frac{1}{\varepsilon})^2)$ time [Chen, Lian, Mao, and Zhang '24][Mao '24]. This running time is the best possible (up to a logarithmic factor), assuming that $(\min,+)$-convolution cannot be solved in truly subquadratic time [Künnemann, Paturi, and Schneider '17][Cygan, Mucha, Węgrzycki, and Włodarczyk '19]. The same upper and lower bounds hold if one seeks a solution with total profit at least $\mathrm{OPT}$ and total weight at most $(1 + \varepsilon)t$. Therefore, it is natural to ask the following question. If one seeks a solution with total profit at least $\mathrm{OPT}/(1+\varepsilon)$ and total weight at most $(1 + \varepsilon)t$, can Knapsack be solved in $\tilde{O}(n + (\frac{1}{\varepsilon})^{2-δ})$ time for some constant $δ> 0$? We answer this open question affirmatively by proposing an $\tilde{O}(n + (\frac{1}{\varepsilon})^{7/4})$-time algorithm.

To appear in ICALP2025

The Competitive Ratio of Threshold Policies for Online Unit-density Knapsack Problems 2025-04-07
Show

We study a wholesale supply chain ordering problem. In this problem, the supplier has an initial stock, and faces an unpredictable stream of incoming orders, making real-time decisions on whether to accept or reject each order. What makes this wholesale supply chain ordering problem special is its "knapsack constraint": we do not allow partially accepting an order or splitting an order. The objective is to maximize the utilized stock. We model this wholesale supply chain ordering problem as an online unit-density knapsack problem. We study randomized threshold algorithms that accept an item as long as its size exceeds the threshold. We derive two optimal threshold distributions: the first is 0.4324-competitive relative to the optimal offline integral packing, and the second is 0.4285-competitive relative to the optimal offline fractional packing. Both results require optimizing the cumulative distribution function of the random threshold, a challenging infinite-dimensional optimization problem. We also consider the generalization to multiple knapsacks, where an arriving item has a different size in each knapsack. We derive a 0.2142-competitive algorithm for this problem. We also show that any randomized algorithm for this problem cannot be more than 0.4605-competitive. This is the first upper bound strictly less than 0.5, which implies the intrinsic challenge of the knapsack constraint. We show how to naturally implement our optimal threshold distributions in the warehouses of a Latin American chain department store. We run simulations on their order data, which demonstrate the efficacy of our proposed algorithms.
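A threshold policy of this kind is easy to sketch: draw a threshold once, then accept each arriving order iff its size clears the threshold and still fits. The uniform draw below is only a placeholder; the paper's 0.4324 ratio comes from optimizing the threshold's distribution, which is not reproduced here.

```python
import random

def threshold_policy(sizes, capacity, threshold=None, rng=random):
    """Online unit-density knapsack with a (random) size threshold:
    accept an arriving order iff its size is at least `threshold` and
    it still fits in the remaining capacity."""
    if threshold is None:
        # placeholder distribution; the optimized CDF is paper-specific
        threshold = rng.uniform(0, capacity)
    used = 0.0
    accepted = []
    for s in sizes:
        if s >= threshold and used + s <= capacity:
            used += s
            accepted.append(s)
    return accepted
```

For example, with orders of sizes 1, 5, 3, 6 arriving into a capacity-10 knapsack and a fixed threshold of 3, the policy accepts 5 and 3 (utilization 8) and rejects the size-1 order (below threshold) and the size-6 order (no longer fits).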

Local Computation Algorithms for Knapsack: impossibility results, and how to avoid them 2025-04-02
Show

Local Computation Algorithms (LCA), as introduced by Rubinfeld, Tamir, Vardi, and Xie (2011), are a type of ultra-efficient algorithms which, given access to a (large) input for a given computational task, are required to provide fast query access to a consistent output solution, without maintaining a state between queries. This paradigm of computation in particular allows for hugely distributed algorithms, where independent instances of a given LCA provide consistent access to a common output solution. The past decade has seen a significant amount of work on LCAs, by and large focusing on graph problems. In this paper, we initiate the study of Local Computation Algorithms for perhaps the archetypal combinatorial optimization problem, Knapsack. We first establish strong impossibility results, ruling out the existence of any non-trivial LCA for Knapsack as well as for several of its relaxations. We then show how equipping the LCA with additional access to the Knapsack instance, namely, weighted item sampling, allows one to circumvent these impossibility results and obtain LCAs with sublinear time and query complexity. Our positive result draws on a connection to the recent notion of reproducibility for learning algorithms (Impagliazzo, Lei, Pitassi, and Sorrell, 2022), a connection we believe to be of independent interest for the design of LCAs.

Generalized Assignment and Knapsack Problems in the Random-Order Model 2025-04-02
Show

We study different online optimization problems in the random-order model. There is a finite set of bins with known capacity and a finite set of items arriving in a random order. Upon arrival of an item, its size and its value for each of the bins are revealed and it has to be decided immediately and irrevocably to which bin the item is assigned, or to not assign the item at all. In this setting, an algorithm is $α$-competitive if the total value of all items assigned to the bins is at least an $α$-fraction of the total value of an optimal assignment that knows all items beforehand. We give an algorithm that is $α$-competitive with $α= (1-\ln(2))/2 \approx 1/6.52$ improving upon the previous best algorithm with $α\approx 1/6.99$ for the generalized assignment problem and the previous best algorithm with $α\approx 1/6.65$ for the integral knapsack problem. We then study the fractional knapsack problem where we have a single bin and it is also allowed to pack items fractionally. For that case, we obtain an algorithm that is $α$-competitive with $α= 1/e \approx 1/2.71$ improving on the previous best algorithm with $α= 1/4.39$. We further show that this competitive ratio is the best-possible for deterministic algorithms in this model.

Introduction to QUDO, Tensor QUDO and HOBO formulations: Qudits, Equivalences, Knapsack Problem, Traveling Salesman Problem and Combinatorial Games 2025-03-31
Show

In this paper, we present a brief review and introduction to Quadratic Unconstrained D-ary Optimization (QUDO), Tensor Quadratic Unconstrained D-ary Optimization (T-QUDO) and Higher-Order Unconstrained Binary Optimization (HOBO) formulations for combinatorial optimization problems. We also show their equivalences. To aid understanding, we give examples for the knapsack problem, the traveling salesman problem, and several combinatorial games. The games chosen to exemplify are: Hashiwokakero, N-Queens, Kakuro, Inshi no heya, and Peg Solitaire. Although some of these games have already been formulated in a QUBO formulation, we approach them with more general formulations, allowing their execution in new quantum or quantum-inspired optimization algorithms. This can be an easier way to introduce these more complicated formulations for harder problems.

18 pages, 5 figures
Improved Approximation Algorithms for Three-Dimensional Knapsack 2025-03-25
Show

We study the three-dimensional Knapsack (3DK) problem, in which we are given a set of axis-aligned cuboids with associated profits and an axis-aligned cube knapsack. The objective is to find a non-overlapping axis-aligned packing (by translation) of the maximum profit subset of cuboids into the cube. The previous best approximation algorithm is due to Diedrich, Harren, Jansen, Thöle, and Thomas (2008), who gave a $(7+\varepsilon)$-approximation algorithm for 3DK and a $(5+\varepsilon)$-approximation algorithm for the variant when the items can be rotated by 90 degrees around any axis, for any constant $\varepsilon>0$. Chlebík and Chlebíková (2009) showed that the problem does not admit an asymptotic polynomial-time approximation scheme. We provide an improved polynomial-time $(139/29+\varepsilon) \approx 4.794$-approximation algorithm for 3DK and $(30/7+\varepsilon) \approx 4.286$-approximation when rotations by 90 degrees are allowed. We also provide improved approximation algorithms for several variants such as the cardinality case (when all items have the same profit) and uniform profit-density case (when the profit of an item is equal to its volume). Our key technical contribution is container packing -- a structured packing in 3D such that all items are assigned into a constant number of containers, and each container is packed using a specific strategy based on its type. We first show the existence of highly profitable container packings. Thereafter, we show that one can find near-optimal container packing efficiently using a variant of the Generalized Assignment Problem (GAP).

Constrained Bandwidth Observation Sharing for Multi-Robot Navigation in Dynamic Environments via Intelligent Knapsack 2025-03-03
Show

Multi-robot navigation is increasingly crucial in various domains, including disaster response, autonomous vehicles, and warehouse and manufacturing automation. Robot teams often must operate in highly dynamic environments and under strict bandwidth constraints imposed by communication infrastructure, rendering effective observation sharing within the system a challenging problem. This paper presents a novel optimal communication scheme, Intelligent Knapsack (iKnap), for multi-robot navigation in dynamic environments under bandwidth constraints. We model multi-robot communication as belief propagation in a graph of inferential agents. We then formulate the combinatorial optimization for observation sharing as a 0/1 knapsack problem, where each potential pairwise communication between robots is assigned a decision-making utility to be weighed against its bandwidth cost, and the system has some cumulative bandwidth limit. We evaluate our approach in a simulated robotic warehouse with human workers using ROS2 and the Open Robotics Middleware Framework. Compared to state-of-the-art broadcast-based optimal communication schemes, iKnap yields significant improvements in navigation performance with respect to scenario complexity while maintaining a similar runtime. Furthermore, iKnap utilizes allocated bandwidth and observational resources more efficiently than existing approaches, especially in very low-resource and high-uncertainty settings. Based on these results, we claim that the proposed method enables more robust collaboration for multi-robot teams in real-world navigation problems.
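The 0/1 knapsack subproblem at the heart of the formulation above (pairwise communications weighed against bandwidth costs under a cumulative limit) can be illustrated with the textbook dynamic program. This is a generic sketch assuming integer bandwidth costs; the names are not iKnap's actual implementation.

```python
def knapsack_01(costs, utilities, budget):
    """Classic 0/1 knapsack DP over integer costs:
    dp[b] = best total utility achievable with cost at most b.
    Iterating b downwards ensures each item is used at most once."""
    dp = [0] * (budget + 1)
    for c, u in zip(costs, utilities):
        for b in range(budget, c - 1, -1):
            dp[b] = max(dp[b], dp[b - c] + u)
    return dp[budget]
```

In the observation-sharing setting, each candidate pairwise communication would contribute one (cost, utility) pair and `budget` would be the system's bandwidth limit.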

Sum-Of-Squares To Approximate Knapsack 2025-02-18
Show

These notes give a self-contained exposition of Karlin, Mathieu and Nguyen's tight estimate of the integrality gap of the sum-of-squares semidefinite program for solving the knapsack problem. They are based on a sequence of three lectures in the CMU course on Advanced Approximation Algorithms in Fall 2021 that used the KMN result to introduce the Sum-of-Squares method for algorithm design. The treatment in these notes uses the pseudo-distribution view of solutions to the sum-of-squares SDPs and relies only on a few basic, reusable results about pseudo-distributions.

Generative-enhanced optimization for knapsack problems: an industry-relevant study 2025-02-07
Show

Optimization is a crucial task in various industries such as logistics, aviation, manufacturing, chemical, pharmaceutical, and insurance, where finding the best solution to a problem can result in significant cost savings and increased efficiency. Tensor networks (TNs) have gained prominence in recent years in modeling classical systems with quantum-inspired approaches. More recently, TN generative-enhanced optimization (TN-GEO) has been proposed as a strategy which uses generative modeling to efficiently sample valid solutions with respect to certain constraints of optimization problems. Moreover, it has been shown that symmetric TNs (STNs) can encode certain constraints of optimization problems, thus aiding in their solution process. In this work, we investigate the applicability of TN- and STN-GEO to an industry-relevant problem class, a multi-knapsack problem, in which each object must be assigned to an available knapsack. We detail a prescription for practitioners to use the TN- and STN-GEO methodology and study its scaling behavior and dependence on its hyper-parameters. We benchmark 60 different problem instances and find that TN-GEO and STN-GEO produce results of similar quality to simulated annealing.

Bandits with Anytime Knapsacks 2025-01-30
Show

We consider bandits with anytime knapsacks (BwAK), a novel version of the BwK problem where there is an \textit{anytime} cost constraint instead of a total cost budget. This problem setting introduces additional complexities as it mandates adherence to the constraint throughout the decision-making process. We propose SUAK, an algorithm that utilizes upper confidence bounds to identify the optimal mixture of arms while maintaining a balance between exploration and exploitation. SUAK is an adaptive algorithm that strategically utilizes the available budget in each round in the decision-making process and skips a round when it is possible to violate the anytime cost constraint. In particular, SUAK slightly under-utilizes the available cost budget to reduce the need for skipping rounds. We show that SUAK attains the same problem-dependent regret upper bound of $ O(K \log T)$ established in prior work under the simpler BwK framework. Finally, we provide simulations to verify the utility of SUAK in practical settings.

A Nearly Quadratic-Time FPTAS for Knapsack 2025-01-07
Show

We investigate the classic Knapsack problem and propose a fully polynomial-time approximation scheme (FPTAS) that runs in $\widetilde{O}(n + (1/\varepsilon)^2)$ time. This improves upon the $\widetilde{O}(n + (1/\varepsilon)^{11/5})$-time algorithm by Deng, Jin, and Mao [\textit{Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms, 2023}]. Our algorithm is the best possible (up to a polylogarithmic factor) conditioned on the conjecture that $(\min, +)$-convolution has no truly subquadratic-time algorithm, since this conjecture implies that Knapsack has no $O((n + 1/\varepsilon)^{2-δ})$-time FPTAS for any constant $δ> 0$.

Hybrid Firefly-Genetic Algorithm for Single and Multi-dimensional 0-1 Knapsack Problems 2024-12-31
Show

This paper addresses the challenges faced by algorithms, such as the Firefly Algorithm (FA) and the Genetic Algorithm (GA), in constrained optimization problems. While both algorithms perform well for unconstrained problems, their effectiveness diminishes when constraints are introduced due to limitations in exploration, exploitation, and constraint handling. To overcome these challenges, a hybrid FAGA algorithm is proposed, combining the strengths of both algorithms. The hybrid algorithm is validated by solving unconstrained benchmark functions and constrained optimization problems, including design engineering problems and combinatorial problems such as the 0-1 Knapsack Problem. The proposed algorithm delivers improved solution accuracy and computational efficiency compared to conventional optimization algorithms. This paper outlines the development and structure of the hybrid algorithm and demonstrates its effectiveness in handling complex optimization problems.

Approximation Schemes for Geometric Knapsack for Packing Spheres and Fat Objects 2024-12-23
Show

We study the geometric knapsack problem in which we are given a set of $d$-dimensional objects (each with associated profits) and the goal is to find the maximum profit subset that can be packed non-overlappingly into a given $d$-dimensional (unit hypercube) knapsack. Even if $d=2$ and all input objects are disks, this problem is known to be \textsf{NP}-hard [Demaine, Fekete, Lang, 2010]. In this paper, we give polynomial time $(1+\varepsilon)$-approximation algorithms for the following types of input objects in any constant dimension $d$: - disks and hyperspheres, - a class of fat convex polygons that generalizes regular $k$-gons for $k\ge 5$ (formally, polygons with a constant number of edges, whose lengths are in a bounded range, and in which each angle is strictly larger than $π/2$), - arbitrary fat convex objects that are sufficiently small compared to the knapsack. We remark that in our \textsf{PTAS} for disks and hyperspheres, we output the computed set of objects, but for $O_\varepsilon(1)$ of them, we determine their coordinates only up to an exponentially small error. However, it is unclear whether there always exists a $(1+\varepsilon)$-approximate solution that uses only rational coordinates for the disks' centers. We leave this as an open problem that is related to well-studied geometric questions in the realm of circle packing.

A preliminary version of the work appeared in the proceedings of the 51st EATCS International Colloquium on Automata, Languages, and Programming (ICALP) 2024

The complexity of knapsack problems in wreath products 2024-11-30
Show

We prove new complexity results for computational problems in certain wreath products of groups and (as an application) for free solvable groups. For a finitely generated group we study the so-called power word problem (does a given expression $u_1^{k_1} \ldots u_d^{k_d}$, where $u_1, \ldots, u_d$ are words over the group generators and $k_1, \ldots, k_d$ are binary encoded integers, evaluate to the group identity?) and knapsack problem (does a given equation $u_1^{x_1} \ldots u_d^{x_d} = v$, where $u_1, \ldots, u_d,v$ are words over the group generators and $x_1,\ldots,x_d$ are variables, have a solution in the natural numbers?). We prove that the power word problem for wreath products of the form $G \wr \mathbb{Z}$ with $G$ nilpotent and iterated wreath products of free abelian groups belongs to $\mathsf{TC}^0$. As an application of the latter, the power word problem for free solvable groups is in $\mathsf{TC}^0$. On the other hand we show that for wreath products $G \wr \mathbb{Z}$, where $G$ is a so called uniformly strongly efficiently non-solvable group (which form a large subclass of non-solvable groups), the power word problem is $\mathsf{coNP}$-hard. For the knapsack problem we show $\mathsf{NP}$-completeness for iterated wreath products of free abelian groups and hence free solvable groups. Moreover, the knapsack problem for every wreath product $G \wr \mathbb{Z}$, where $G$ is uniformly efficiently non-solvable, is $Σ^2_p$-hard.

Approximation Algorithms for Correlated Knapsack Orienteering 2024-11-29
Show

We consider the {\em correlated knapsack orienteering} (CSKO) problem: we are given a travel budget $B$, processing-time budget $W$, finite metric space $(V,d)$ with root $ρ\in V$, where each vertex is associated with a job with possibly correlated random size and random reward that become known only when the job completes. Random variables are independent across different vertices. The goal is to compute a $ρ$-rooted path of length at most $B$, in a possibly adaptive fashion, that maximizes the reward collected from jobs that are processed by time $W$. To our knowledge, CSKO has not been considered before, though prior work has considered the uncorrelated problem, {\em stochastic knapsack orienteering}, and {\em correlated orienteering}, which features only one budget constraint on the {\em sum} of travel-time and processing-times. We show that the {\em adaptivity gap of CSKO is not a constant, and is at least $Ω\bigl(\max\{\sqrt{\log{B}},\sqrt{\log\log{W}}\}\bigr)$}. Complementing this, we devise {\em non-adaptive} algorithms that obtain: (a) $O(\log\log W)$-approximation in quasi-polytime; and (b) $O(\log W)$-approximation in polytime. We obtain similar guarantees for CSKO with cancellations, wherein a job can be cancelled before its completion time, foregoing its reward. We also consider the special case of CSKO, wherein job sizes are weighted Bernoulli distributions, and more generally where the distributions are supported on at most two points (2-CSKO). Although weighted Bernoulli distributions suffice to yield an $Ω(\sqrt{\log\log B})$ adaptivity-gap lower bound for (uncorrelated) {\em stochastic orienteering}, we show that they are easy instances for CSKO. We develop non-adaptive algorithms that achieve $O(1)$-approximation in polytime for weighted Bernoulli distributions, and in $(n+\log B)^{O(\log W)}$-time for the more general case of 2-CSKO.

Full version of APPROX 2024 paper

A quantum algorithm for solving 0-1 Knapsack problems 2024-11-19
Show

Here we present two novel contributions for achieving quantum advantage in solving difficult optimisation problems, both in theory and foreseeable practice. (1) We introduce the "Quantum Tree Generator", an approach to generate in superposition all feasible solutions of a given instance, yielding together with amplitude amplification the optimal solutions for 0-1 knapsack problems. The QTG offers massive memory savings and enables competitive runtimes compared to the classical state-of-the-art knapsack solvers (such as COMBO, Gurobi, CP-SAT, Greedy) already for instances involving as few as 100 variables. (2) By introducing a new runtime calculation technique that exploits logging data from the classical solver COMBO, we can predict the runtime of our method way beyond the range of existing quantum platforms and simulators, for various benchmark instances with up to 600 variables. Combining both of these innovations, we demonstrate the QTG's potential practical quantum advantage for large-scale problems, indicating an effective approach for combinatorial optimisation problems.

6+13 pages, 11 figures

Online Unbounded Knapsack 2024-10-31
Show

We analyze the competitive ratio and the advice complexity of the online unbounded knapsack problem. An instance is given as a sequence of n items with a size and a value each, and an algorithm has to decide how often to pack each item into a knapsack of bounded capacity. The items are given online and the total size of the packed items must not exceed the knapsack's capacity, while the objective is to maximize the total value of the packed items. While each item can only be packed once in the classical 0-1 knapsack problem, the unbounded version allows for items to be packed multiple times. We show that the simple unbounded knapsack problem, where the size of each item is equal to its value, allows for a competitive ratio of 2. We also analyze randomized algorithms and show that, in contrast to the 0-1 knapsack problem, one uniformly random bit cannot improve an algorithm's performance. More randomness lowers the competitive ratio to less than 1.736, but it can never be below 1.693. In the advice complexity setting, we measure how many bits of information the algorithm has to know to achieve some desired solution quality. For the simple unbounded knapsack problem, one advice bit lowers the competitive ratio to 3/2. While this cannot be improved with fewer than log(n) advice bits for instances of length n, a competitive ratio of 1+epsilon can be achieved with O(log(n/epsilon)/epsilon) advice bits for any epsilon>0. We further show that no amount of advice bounded by a function f(n) allows an algorithm to be optimal. We also study the online general unbounded knapsack problem and show that it does not allow for any bounded competitive ratio for deterministic and randomized algorithms, as well as for algorithms using fewer than log(n) advice bits. We also provide an algorithm that uses O(log(n/epsilon)/epsilon) advice bits to achieve a competitive ratio of 1+epsilon for any epsilon>0.
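For contrast with the online setting above, the offline unbounded knapsack problem (each item packed any number of times) is solved by a standard dynamic program. This sketch assumes integer sizes and is not one of the paper's online algorithms; it only illustrates the packing model.

```python
def unbounded_knapsack(sizes, values, capacity):
    """Offline unbounded knapsack DP: each item may be packed any
    number of times; dp[c] = best value using capacity at most c."""
    dp = [0] * (capacity + 1)
    for c in range(1, capacity + 1):
        for s, v in zip(sizes, values):
            if s <= c:
                dp[c] = max(dp[c], dp[c - s] + v)
    return dp[capacity]
```

In the simple unbounded variant studied in the abstract, `values` would equal `sizes`, and the online algorithm must commit to packings without seeing future items.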

Approximately Counting Knapsack Solutions in Subquadratic Time 2024-10-29
Show

We revisit the classic #Knapsack problem, which asks to count the Boolean points $(x_1,\dots,x_n)\in\{0,1\}^n$ in a given half-space $\sum_{i=1}^nW_ix_i\le T$. This #P-complete problem admits $(1\pm ε)$-approximation. Before this work, [Dyer, STOC 2003]'s $\tilde{O}(n^{2.5}+n^2 ε^{-2})$-time randomized approximation scheme remained the fastest known in the natural regime of $ε\ge 1/\mathrm{polylog}(n)$. In this paper, we give a randomized $(1\pm ε)$-approximation algorithm in $\tilde{O}(n^{1.5} ε^{-2})$ time (in the standard word-RAM model), achieving the first sub-quadratic dependence on $n$. Such sub-quadratic running time is rare in the approximate counting literature in general, as a large class of algorithms naturally faces a quadratic-time barrier. Our algorithm follows Dyer's framework, which reduces #Knapsack to the task of sampling (and approximately counting) solutions in a randomly rounded instance with poly(n)-bounded integer weights. We refine Dyer's framework using the following ideas: - We decrease the sample complexity of Dyer's Monte Carlo method, by proving some structural lemmas for typical points near the input hyperplane via hitting-set arguments, and appropriately setting the rounding scale. - Instead of running a vanilla dynamic program on the rounded instance, we employ techniques from the growing field of pseudopolynomial-time Subset Sum algorithms, such as FFT, divide-and-conquer, and balls-into-bins hashing of [Bringmann, SODA 2017]. We also need other ingredients, including a surprising application of the recent Bounded Monotone (max,+)-Convolution algorithm by [Chi-Duan-Xie-Zhang, STOC 2022] (adapted by [Bringmann-Dürr-Polak, ESA 2024]), the notion of sum-approximation from [Gawrychowski-Markin-Weimann, ICALP 2018]'s #Knapsack approximation scheme, and a two-phase extension of Dyer's framework for handling tiny weights.
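The quantity being approximated is easy to compute exactly when the threshold is small: a pseudopolynomial $O(nT)$ dynamic program counts the solutions directly. This baseline (an illustrative sketch, not the paper's algorithm) is what becomes infeasible for the exponentially large weights that the approximation schemes target.

```python
def count_knapsack(weights, T):
    """Exact DP counting 0/1 vectors x with sum(W_i * x_i) <= T.
    dp[t] = number of subsets with total weight exactly t."""
    dp = [0] * (T + 1)
    dp[0] = 1  # the empty subset
    for w in weights:
        for t in range(T, w - 1, -1):  # downwards: each item used once
            dp[t] += dp[t - w]
    return sum(dp)
```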

To appear at SODA 2025

Maximizing a Submodular Function with Bounded Curvature under an Unknown Knapsack Constraint 2024-10-24
Show

This paper studies the problem of maximizing a monotone submodular function under an unknown knapsack constraint. A solution to this problem is a policy that decides which item to pack next based on the past packing history. The robustness factor of a policy is the worst case ratio of the solution obtained by following the policy and an optimal solution that knows the knapsack capacity. We develop a policy with a robustness factor that is decreasing in the curvature $c$ of the submodular function. For the extreme case $c=0$, corresponding to an additive objective function, it matches a previously known and best possible robustness factor of $1/2$. For the other extreme case of $c=1$, it yields a robustness factor of $\approx 0.35$ improving over the best previously known robustness factor of $\approx 0.06$. The analysis of our policy relies on a greedy algorithm that is a slight modification of Wolsey's greedy algorithm for the submodular knapsack problem with a known knapsack constraint. We obtain tight approximation guarantees for both of these algorithms in the setting of a submodular objective function with curvature $c$.
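A minimal sketch of the density-greedy idea behind Wolsey-style algorithms for submodular knapsack with a known capacity; the paper's policy for an unknown capacity is more involved, and all names here are assumptions of the sketch.

```python
def greedy_submodular_knapsack(items, weights, f, budget):
    """Density greedy under a known knapsack constraint: repeatedly
    add the feasible item with the largest marginal gain of f per
    unit weight, until nothing with positive gain fits."""
    chosen = set()
    remaining = set(items)
    while remaining:
        best, best_ratio = None, 0.0
        for i in remaining:
            if weights[i] <= budget:
                gain = f(chosen | {i}) - f(chosen)
                if gain / weights[i] > best_ratio:
                    best, best_ratio = i, gain / weights[i]
        if best is None:
            break
        chosen.add(best)
        budget -= weights[best]
        remaining.remove(best)
    return chosen
```

For a worst-case guarantee, such greedy algorithms are typically combined with a comparison against the single most valuable feasible item.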

Packing a Knapsack with Items Owned by Strategic Agents 2024-10-08
Show

This paper considers a scenario within the field of mechanism design without money where a mechanism designer is interested in selecting items with maximum total value under a knapsack constraint. The items, however, are controlled by strategic agents who aim to maximize the total value of their items in the knapsack. This is a natural setting, e.g., when agencies select projects for funding, companies select products for sale in their shops, or hospitals schedule MRI scans for the day. A mechanism governing the packing of the knapsack is strategyproof if no agent can benefit from hiding items controlled by them to the mechanism. We are interested in mechanisms that are strategyproof and $α$-approximate in the sense that they always approximate the maximum value of the knapsack by a factor of $α\in [0,1]$. First, we give a deterministic mechanism that is $\frac{1}{3}$-approximate. For the special case where all items have unit density, we design a $\frac{1}{φ}$-approximate mechanism where $1/φ\approx 0.618$ is the inverse of the golden ratio. This result is tight as we show that no deterministic strategyproof mechanism with a better approximation exists. We further give randomized mechanisms with approximation guarantees of $1/2$ for the general case and $2/3$ for the case of unit densities. For both cases, no strategyproof mechanism can achieve an approximation guarantee better than $1/(5φ-7)\approx 0.917$.

Knapsack with Vertex Cover, Set Cover, and Hitting Set 2024-10-05
Show

Given an undirected graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$, with vertex weights $(w(u))_{u\in\mathcal{V}}$, vertex values $(α(u))_{u\in\mathcal{V}}$, a knapsack size $s$, and a target value $d$, the \vcknapsack problem is to determine if there exists a subset $\mathcal{U}\subseteq\mathcal{V}$ of vertices such that $\mathcal{U}$ forms a vertex cover, $w(\mathcal{U})=\sum_{u\in\mathcal{U}} w(u) \le s$, and $α(\mathcal{U})=\sum_{u\in\mathcal{U}} α(u) \ge d$. In this paper, we closely study the \vcknapsack problem and its variations, such as \vcknapsackbudget, \minimalvcknapsack, and \minimumvcknapsack, for both general graphs and trees. We first prove that the \vcknapsack problem belongs to the complexity class \NPC and then study the complexity of the other variations. We generalize the problem to \setc and \hs versions and design polynomial time $H_g$-factor approximation algorithm for the \setckp problem and d-factor approximation algorithm for \hstp using primal dual method. We further show that \setcks and \hsmb are hard to approximate in polynomial time. Additionally, we develop a fixed parameter tractable algorithm running in time $8^{\mathcal{O}({\rm tw})}\cdot n\cdot {\sf min}\{s,d\}$ where ${\rm tw},s,d,n$ are respectively treewidth of the graph, the size of the knapsack, the target value of the knapsack, and the number of items for the \minimalvcknapsack problem.

Improved Parallel Algorithm for Non-Monotone Submodular Maximization under Knapsack Constraint 2024-09-06
Show

This work proposes an efficient parallel algorithm for non-monotone submodular maximization under a knapsack constraint problem over the ground set of size $n$. Our algorithm improves the best approximation factor of the existing parallel one from $8+ε$ to $7+ε$ with $O(\log n)$ adaptive complexity. The key idea of our approach is to create a new alternate threshold algorithmic framework. This strategy alternately constructs two disjoint candidate solutions within a constant number of sequential rounds. Then, the algorithm boosts solution quality without sacrificing the adaptive complexity. Extensive experimental studies on three applications, Revenue Maximization, Image Summarization, and Maximum Weighted Cut, show that our algorithm not only significantly increases solution quality but also requires adaptivity comparable to that of state-of-the-art algorithms.

In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI), Main Track

Bounding the Price-of-Fair-Sharing using Knapsack-Cover Constraints to guide Near-Optimal Cost-Recovery Algorithms 2024-08-29
Show

We consider the problem of fairly allocating the cost of providing a service among a set of users, where the service cost is formulated by an NP-hard {\it covering integer program (CIP)}. The central issue is to determine a cost allocation to each user that, in total, recovers as much as possible of the actual cost while satisfying a stabilizing condition known as the {\it core property}. The ratio between the total service cost and the cost recovered from users has been studied previously, with seminal papers of Deng, Ibaraki, & Nagamochi and Goemans & Skutella linking this {\it price-of-fair-sharing} to the integrality gap of an associated LP relaxation. Motivated by an application of cost allocation for network design for LPWANs, an emerging IoT technology, we investigate a general class of CIPs and give the first non-trivial price-of-fair-sharing bounds by using the natural LP relaxation strengthened with knapsack-cover inequalities. Furthermore, we demonstrate that these LP-based methods outperform previously known methods on an LPWAN-derived CIP data set. We also obtain analogous results for a more general setting in which the service provider also gets to select the subset of users, and the mechanism to elicit users' private utilities should be group-strategyproof. The key to obtaining this result is a simplified and improved analysis for a cross-monotone cost-allocation mechanism.

Extended version of paper appearing in proceedings of WAOA 2024

Fine Grained Lower Bounds for Multidimensional Knapsack 2024-07-14
Show

We study the $d$-dimensional knapsack problem. We are given a set of items, each with a $d$-dimensional cost vector and a profit, along with a $d$-dimensional budget vector. The goal is to select a set of items that do not exceed the budget in all dimensions and maximize the total profit. A PTAS with running time $n^{Θ(d/\varepsilon)}$ has long been known for this problem, where $\varepsilon$ is the error parameter and $n$ is the encoding size. Despite decades of active research, the best running time of a PTAS has remained $O(n^{\lceil d/\varepsilon \rceil - d})$. Unfortunately, existing lower bounds only cover the special case with two dimensions $d = 2$, and do not answer whether there is an $n^{o(d/\varepsilon)}$-time PTAS for larger values of $d$. The status of exact algorithms is similar: there is a simple $O(n \cdot W^d)$-time (exact) dynamic programming algorithm, where $W$ is the maximum budget, but there is no lower bound which explains the strong exponential dependence on $d$. In this work, we show that the running times of the best-known PTAS and exact algorithm cannot be improved up to a polylogarithmic factor assuming Gap-ETH. Our techniques are based on a robust reduction from 2-CSP, which embeds 2-CSP constraints into a desired number of dimensions, exhibiting tight trade-off between $d$ and $\varepsilon$ for most regimes of the parameters. Informally, we obtain the following main results for $d$-dimensional knapsack. No $n^{o(d/\varepsilon \cdot 1/(\log(d/\varepsilon))^2)}$-time $(1-\varepsilon)$-approximation for every $\varepsilon = O(1/\log d)$. No $(n+W)^{o(d/\log d)}$-time exact algorithm (assuming ETH). No $n^{o(\sqrt{d})}$-time $(1-\varepsilon)$-approximation for constant $\varepsilon$. $(d \cdot \log W)^{O(d^2)} + n^{O(1)}$-time $Ω(1/\sqrt{d})$-approximation and a matching $n^{O(1)}$-time lower bound.
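The simple $O(n \cdot W^d)$ exact dynamic program mentioned above can be sketched directly. This illustration enumerates every residual budget vector, so it is only practical for tiny $W$ and $d$; the names are assumptions of the sketch.

```python
from itertools import product

def multidim_knapsack(costs, profits, budget):
    """Exact O(n * W^d) DP for d-dimensional knapsack:
    dp maps each residual budget vector to the best total profit."""
    d = len(budget)
    dp = {v: 0 for v in product(*[range(b + 1) for b in budget])}
    for cost, p in zip(costs, profits):
        # Process budget vectors in reverse lexicographic order so
        # each item is used at most once (as in the 1-D 0/1 DP).
        for v in sorted(dp, reverse=True):
            prev = tuple(v[i] - cost[i] for i in range(d))
            if all(x >= 0 for x in prev):
                dp[v] = max(dp[v], dp[prev] + p)
    return dp[tuple(budget)]
```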

Toward Practical Benchmarks of Ising Machines: A Case Study on the Quadratic Knapsack Problem 2024-07-14
Show

Combinatorial optimization has wide applications from industry to natural science. Ising machines bring an emerging computing paradigm for efficiently solving a combinatorial optimization problem by searching a ground state of a given Ising model. Current cutting-edge Ising machines achieve fast sampling of near-optimal solutions of the max-cut problem. However, for problems with additional constraint conditions, their advantages have been hardly shown due to difficulties in handling the constraints. In this work, we focus on benchmarks of Ising machines on the quadratic knapsack problem (QKP). To bring out their practical performance, we propose fast two-stage post-processing for Ising machines, which makes handling the constraint easier. Simulation based on simulated annealing shows that the proposed method substantially improves the solving performance of Ising machines and the improvement is robust to a choice of encoding of the constraint condition. Through evaluation using an Ising machine called Amplify Annealing Engine, the proposed method is shown to dramatically improve its solving performance on the QKP. These results are a crucial step toward showing advantages of Ising machines on practical problems involving various constraint conditions.

26 pages
Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size 2024-07-11
Show

The development of a satisfying and rigorous mathematical understanding of the performance of neural networks is a major challenge in artificial intelligence. Against this background, we study the expressive power of neural networks through the example of the classical NP-hard Knapsack Problem. Our main contribution is a class of recurrent neural networks (RNNs) with rectified linear units that are iteratively applied to each item of a Knapsack instance and thereby compute optimal or provably good solution values. We show that an RNN of depth four and width depending quadratically on the profit of an optimum Knapsack solution is sufficient to find optimum Knapsack solutions. We also prove the following tradeoff between the size of an RNN and the quality of the computed Knapsack solution: for Knapsack instances consisting of $n$ items, an RNN of depth five and width $w$ computes a solution of value at least $1-\mathcal{O}(n^2/\sqrt{w})$ times the optimum solution value. Our results build upon a classical dynamic programming formulation of the Knapsack Problem as well as a careful rounding of profit values that are also at the core of the well-known fully polynomial-time approximation scheme for the Knapsack Problem. A carefully conducted computational study qualitatively supports our theoretical size bounds. Finally, we point out that our results can be generalized to many other combinatorial optimization problems that admit dynamic programming solution methods, such as various Shortest Path Problems, the Longest Common Subsequence Problem, and the Traveling Salesperson Problem.
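The classical profit-indexed dynamic program that both the RNN construction and the well-known knapsack FPTAS build on tracks, for each achievable total profit, the minimum weight needed to realize it. A sketch with assumed names, for integer profits:

```python
def knapsack_by_profit(weights, profits, capacity):
    """Profit-indexed knapsack DP: min_w[p] is the minimum total
    weight achieving total profit exactly p; the answer is the
    largest p whose minimum weight fits the capacity."""
    P = sum(profits)
    INF = float("inf")
    min_w = [0] + [INF] * P
    for w, p in zip(weights, profits):
        for q in range(P, p - 1, -1):  # downwards: 0/1 semantics
            if min_w[q - p] + w < min_w[q]:
                min_w[q] = min_w[q - p] + w
    return max(q for q in range(P + 1) if min_w[q] <= capacity)
```

The FPTAS rounds the profits before running this DP, shrinking $P$ at a bounded loss in solution value.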

Authors' accepted manuscript for the INFORMS Journal on Computing. A short version of this paper appeared in the proceedings of AAAI 2021

Even Faster Knapsack via Rectangular Monotone Min-Plus Convolution and Balancing 2024-07-01
Show

We present a pseudopolynomial-time algorithm for the Knapsack problem that has running time $\widetilde{O}(n + t\sqrt{p_{\max}})$, where $n$ is the number of items, $t$ is the knapsack capacity, and $p_{\max}$ is the maximum item profit. This improves over the $\widetilde{O}(n + t \cdot p_{\max})$-time algorithm based on the convolution and prediction technique by Bateni et al. (STOC 2018). Moreover, we give some evidence, based on a strengthening of the Min-Plus Convolution Hypothesis, that our running time might be optimal. Our algorithm uses two new technical tools, which might be of independent interest. First, we generalize the $\widetilde{O}(n^{1.5})$-time algorithm for bounded monotone min-plus convolution by Chi et al. (STOC 2022) to the \emph{rectangular} case where the range of entries can be different from the sequence length. Second, we give a reduction from general knapsack instances to \emph{balanced} instances, where all items have nearly the same profit-to-weight ratio, up to a constant factor. Using these techniques, we can also obtain algorithms that run in time $\widetilde{O}(n + OPT\sqrt{w_{\max}})$, $\widetilde{O}(n + (nw_{\max}p_{\max})^{1/3}t^{2/3})$, and $\widetilde{O}(n + (nw_{\max}p_{\max})^{1/3} OPT^{2/3})$, where $OPT$ is the optimal total profit and $w_{\max}$ is the maximum item weight.
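For contrast with the improved bounds above, the textbook capacity-indexed DP that these algorithms beat runs in $O(n \cdot t)$ time and can be sketched as follows (an illustrative baseline, not the algorithm of this paper):

```python
def knapsack_capacity_dp(profits, weights, t):
    """Capacity-indexed 0-1 knapsack DP (O(n * t) baseline).

    dp[c] = best total profit achievable with total weight at most c.
    """
    dp = [0] * (t + 1)
    for p, w in zip(profits, weights):
        # downward sweep so each item is taken at most once
        for c in range(t, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + p)
    return dp[t]

# e.g. knapsack_capacity_dp([60, 100, 120], [10, 20, 30], 50) -> 220
```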

An EPTAS for Cardinality Constrained Multiple Knapsack via Iterative Randomized Rounding 2024-06-10
Show

In [Math. Oper. Res., 2011], Fleischer et al. introduced a powerful technique for solving the generic class of separable assignment problems (SAP), in which a set of items of given values and weights needs to be packed into a set of bins subject to separable assignment constraints, so as to maximize the total value. The approach of Fleischer et al. relies on solving a configuration LP and sampling a configuration for each bin independently based on the LP solution. While there is a SAP variant for which this approach yields the best possible approximation ratio, for various special cases, there are discrepancies between the approximation ratios obtained using the above approach and the state-of-the-art approximations. This raises the following natural question: Can we do better by iteratively solving the configuration LP and sampling a few bins at a time? To assess the potential gain from iterative randomized rounding, we consider as a case study one interesting SAP variant, namely, Uniform Cardinality Constrained Multiple Knapsack, for which we answer this question affirmatively. The input is a set of items, each has a value and a weight, and a set of uniform capacity bins. The goal is to assign a subset of the items of maximum total value to the bins such that $(i)$ the capacity of any bin is not exceeded, and $(ii)$ the number of items assigned to each bin satisfies a given cardinality constraint. While the technique of Fleischer et al. yields a $\left(1-\frac{1}{e}\right)$-approximation for the problem, we show that iterative randomized rounding leads to an efficient polynomial time approximation scheme (EPTAS), thus essentially resolving the complexity status of the problem. Our analysis of iterative randomized rounding can be useful for solving other SAP variants.

Finding and Exploring Promising Search Space for the 0-1 Multidimensional Knapsack Problem 2024-05-27
Show

The 0-1 Multidimensional Knapsack Problem (MKP) is a classical NP-hard combinatorial optimization problem with many engineering applications. In this paper, we propose a novel algorithm combining evolutionary computation with the exact algorithm to solve the 0-1 MKP. It maintains a set of solutions and utilizes the information from the population to extract good partial assignments. To find high-quality solutions, an exact algorithm is applied to explore the promising search space specified by the good partial assignments. The new solutions are used to update the population. Thus, the good partial assignments evolve towards a better direction with the improvement of the population. Extensive experimentation with commonly used benchmark sets shows that our algorithm outperforms the state-of-the-art heuristic algorithms, TPTEA and DQPSO, as well as the commercial solver CPLEX. It finds better solutions than the existing algorithms and provides new lower bounds for 10 large and hard instances.

Randomized heuristic repair for large-scale multidimensional knapsack problem 2024-05-24
Show

The multidimensional knapsack problem (MKP) is an NP-hard combinatorial optimization problem whose solution is determining a subset of maximum total profit items that do not violate capacity constraints. Due to its hardness, large-scale MKP instances are usually a target for metaheuristics, a context in which effective feasibility maintenance strategies are crucial. In 1998, Chu and Beasley proposed an effective heuristic repair that is still relevant for recent metaheuristics. However, due to its deterministic nature, the diversity of solutions this heuristic provides is insufficient for long runs. As a result, the search for new solutions ceases after a while. This paper proposes an efficiency-based randomization strategy for the heuristic repair that increases the variability of the repaired solutions without deteriorating quality and improves the overall results.
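A drop/add repair in the spirit of Chu and Beasley can be sketched as follows (a simplified illustration under an assumed pseudo-utility ordering; the `order` parameter is illustrative, and randomizing it is the kind of strategy this paper studies):

```python
def repair(x, profits, weights, capacities, order=None):
    """Chu-Beasley-style drop/add repair for the 0-1 MKP (sketch).

    x            -- 0/1 incumbent solution (possibly infeasible), modified in place
    weights[j][i] -- consumption of resource i by item j
    order        -- items ranked best-first by an efficiency measure
    """
    m, n = len(capacities), len(profits)
    if order is None:
        # simple pseudo-utility: profit per unit of total resource use (illustrative)
        order = sorted(range(n),
                       key=lambda j: profits[j] / (1 + sum(weights[j])),
                       reverse=True)
    used = [sum(weights[j][i] for j in range(n) if x[j]) for i in range(m)]
    # DROP phase: remove least-efficient selected items until feasible
    for j in reversed(order):
        if x[j] and any(used[i] > capacities[i] for i in range(m)):
            x[j] = 0
            for i in range(m):
                used[i] -= weights[j][i]
    # ADD phase: insert most-efficient unselected items that still fit
    for j in order:
        if not x[j] and all(used[i] + weights[j][i] <= capacities[i] for i in range(m)):
            x[j] = 1
            for i in range(m):
                used[i] += weights[j][i]
    return x
```

With a deterministic `order`, every repair of the same infeasible solution yields the same result, which is precisely the diversity limitation the abstract describes.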

Minimum Cut

Title Date Abstract Comment
Faster Pseudo-Deterministic Minimum Cut 2026-02-22
Show

Pseudo-deterministic algorithms are randomized algorithms that, with high constant probability, output a fixed canonical solution. The study of pseudo-deterministic algorithms for the global minimum cut problem was recently initiated by Agarwala and Varma [ITCS'26], who gave a black-box reduction incurring an $O(\log n \log \log n)$ overhead. We introduce a natural graph-theoretic tie-breaking mechanism that uniquely selects a canonical minimum cut. Using this mechanism, we obtain: (i) A pseudo-deterministic minimum cut algorithm for weighted graphs running in $O(m\log^2 n)$ time, eliminating the $O(\log n \log \log n)$ overhead of prior work and matching existing randomized algorithms. (ii) The first pseudo-deterministic algorithm for maintaining a canonical minimum cut in a fully-dynamic unweighted graph, with $\mathrm{polylog}(n)$ update time and $\tilde{O}(n)$ query time. (iii) Improved pseudo-deterministic algorithms for unweighted graphs in the dynamic streaming and cut-query models of computation, matching the best randomized algorithms.

Corrected references
Combinatorial Optimization using Comparison Oracles 2026-02-20
Show

In linear combinatorial optimization, we aim to find $S^* = \arg\min_{S \in \mathcal{F}} \langle w,\mathbf{1}_S \rangle$ for a family $\mathcal{F} \subseteq 2^U$ over a ground set $U$ of $n$ elements. Traditionally, $w$ is known or accessible via a value oracle. Motivated by practical applications involving pairwise preferences, we study the weaker and more robust comparison oracle, which for any $S, T \in \mathcal{F}$ reveals only if $w(S) <, =, \text{ or } > w(T)$. We investigate the query complexity and computational efficiency of optimizing in this model. We present three main contributions. (1) Query Complexity: We establish that the query complexity over any arbitrary set system $\mathcal{F} \subseteq 2^U$ is $\tilde{O}(n^2)$. This demonstrates a fundamental separation between information and computational complexity, as the runtime may still be exponential for NP-hard problems. (2) Algorithmic Frameworks: We develop two general tools. First, a Dual Ellipsoid framework establishes an efficient reduction from optimization to certification. It shows that to optimize efficiently, it suffices to efficiently certify a candidate's optimality using only comparisons. Second, Global Subspace Learning (GSL) sorts all feasible sets using $O(nB \log(nB))$ queries for integer weights bounded by $B$. We efficiently implement GSL for linear matroids, yielding improved query complexities for problems like $k$-SUM, SUBSET-SUM, and $A+B$ sorting. (3) Combinatorial Applications: We give the first polynomial-time, low-query algorithms for classic problems, including minimum cuts, minimum weight spanning trees (and matroid bases), bipartite matching (and matroid intersection), and shortest $s$-$t$ paths. Our work provides the first general query complexity bounds and efficient algorithmic results for this fundamental model.

Compact Conformal Subgraphs 2026-02-07
Show

Conformal prediction provides rigorous, distribution-free uncertainty guarantees, but often yields prohibitively large prediction sets in structured domains such as routing, planning, or sequential recommendation. We introduce "graph-based conformal compression", a framework for constructing compact subgraphs that preserve statistical validity while reducing structural complexity. We formulate compression as selecting a smallest subgraph capturing a prescribed fraction of the probability mass, and reduce to a weighted version of densest $k$-subgraphs in hypergraphs, in the regime where the subgraph has a large fraction of edges. We design efficient approximation algorithms that achieve constant factor coverage and size trade-offs. Crucially, we prove that our relaxation satisfies a monotonicity property, derived from a connection to parametric minimum cuts, which guarantees the nestedness required for valid conformal guarantees. Our results on the one hand bridge efficient conformal prediction with combinatorial graph compression via monotonicity, to provide rigorous guarantees on both statistical validity, and compression or size. On the other hand, they also highlight an algorithmic regime, distinct from classical densest-$k$-subgraph hardness settings, where the problem can be approximated efficiently. We finally validate our algorithmic approach via simulations for trip planning and navigation, and compare to natural baselines.

Atomic Information Flow: A Network Flow Model for Tool Attributions in RAG Systems 2026-02-04
Show

Many tool-based Retrieval Augmented Generation (RAG) systems lack precise mechanisms for tracing final responses back to specific tool components -- a critical gap as systems scale to complex multi-agent architectures. We present \textbf{Atomic Information Flow (AIF)}, a graph-based network flow model that decomposes tool outputs and LLM calls into atoms: indivisible, self-contained units of information. By modeling LLM orchestration as a directed flow of atoms from tool and LLM nodes to a response super-sink, AIF enables granular attribution metrics for AI explainability. Motivated by the max-flow min-cut theorem in network flow theory, we train a lightweight Gemma3 (4B parameter) language model as a context compressor to approximate the minimum cut of tool atoms using flow signals computed offline by AIF. We note that the base Gemma3-4B model struggles to identify critical information with \textbf{54.7%} accuracy on HotpotQA, barely outperforming lexical baselines (BM25). However, post-training on AIF signals boosts accuracy to \textbf{82.71%} (+28.01 points) while achieving \textbf{87.52%} (+1.85%) context token compression -- bridging the gap with the Gemma3-27B variant, a model nearly $7\times$ larger.
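The max-flow min-cut duality invoked above can be illustrated with a small BFS-based (Edmonds-Karp) max-flow routine that also recovers the source side of a minimum $s$-$t$ cut (a generic sketch, independent of the paper's AIF system; all names are illustrative):

```python
from collections import deque

def min_cut(n, edges, s, t):
    """Max flow and min s-t cut via BFS augmentation (Edmonds-Karp).

    edges: list of (u, v, capacity) on vertices 0..n-1.
    Returns (max_flow_value, source_side_of_a_min_cut).
    """
    cap = [[0] * n for _ in range(n)]
    for u, v, c in edges:
        cap[u][v] += c
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if cap[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            break  # no augmenting path: flow is maximum
        # find the bottleneck capacity along the path, then push flow
        b, v = float("inf"), t
        while v != s:
            b = min(b, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= b
            cap[v][parent[v]] += b
            v = parent[v]
        flow += b
    # min-cut source side = vertices reachable from s in the residual graph
    reach = [False] * n
    reach[s] = True
    q = deque([s])
    while q:
        u = q.popleft()
        for v in range(n):
            if cap[u][v] > 0 and not reach[v]:
                reach[v] = True
                q.append(v)
    return flow, {v for v in range(n) if reach[v]}
```

By the max-flow min-cut theorem, the returned flow value equals the total capacity crossing the returned cut; AIF's training signal approximates such a minimum cut over atom nodes.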

Dynamic Hierarchical $j$-Tree Decomposition and Its Applications 2026-01-14
Show

We develop a new algorithmic framework for designing approximation algorithms for cut-based optimization problems on capacitated undirected graphs that undergo edge insertions and deletions. Specifically, our framework dynamically maintains a variant of the hierarchical $j$-tree decomposition of [Madry FOCS'10], achieving a poly-logarithmic approximation factor to the graph's cut structure and supporting edge updates in $O(n^ε)$ amortized update time, for any arbitrarily small constant $ε\in (0,1)$. Consequently, we obtain new trade-offs between approximation and update/query time for fundamental cut-based optimization problems in the fully dynamic setting, including all-pairs minimum cuts, sparsest cut, multi-way cut, and multi-cut. For the last three problems, these trade-offs give the first fully-dynamic algorithms achieving poly-logarithmic approximation in sub-linear time per operation. The main technical ingredient behind our dynamic hierarchy is a dynamic cut-sparsifier algorithm that can handle vertex splits with low recourse. This is achieved by white-boxing the dynamic cut sparsifier construction of [Abraham et al. FOCS'16], based on forest packing, together with new structural insights about the maintenance of these forests under vertex splits. Given the versatility of cut sparsification in both the static and dynamic graph algorithms literature, we believe this construction may be of independent interest.

SODA 2026
Pseudodeterministic Algorithms for Minimum Cut Problems 2025-12-29
Show

In this paper, we present efficient pseudodeterministic algorithms for both the global minimum cut and minimum s-t cut problems. The running time of our algorithm for the global minimum cut problem is asymptotically better than the fastest sequential deterministic global minimum cut algorithm (Henzinger, Li, Rao, Wang; SODA 2024). Furthermore, we implement our algorithm in sequential, streaming, PRAM, and cut-query models, where no efficient deterministic global minimum cut algorithms are known.

Accepted to ITCS 2026

Approximating Directed Minimum Cut and Arborescence Packing via Directed Expander Hierarchies 2025-12-17
Show

We give almost-linear-time algorithms for approximating rooted minimum cut and maximum arborescence packing in directed graphs, two problems that are dual to each other [Edm73]. More specifically, for an $n$-vertex, $m$-edge directed graph $G$ whose $s$-rooted minimum cut value is $k$, our first algorithm computes an $s$-rooted cut of size at most $O(k\log^{5} n)$ in $m^{1+o(1)}$ time, and our second algorithm packs $k$ $s$-rooted arborescences with $n^{o(1)}$ congestion in $m^{1+o(1)}$ time, certifying that the $s$-rooted minimum cut is at least $k / n^{o(1)}$. Our first algorithm also works for weighted graphs. Prior to our work, the fastest algorithms for computing the $s$-rooted minimum cut were exact but had super-linear running time: either $\tilde{O}(mk)$ [Gab91] or $\tilde{O}(m^{1+o(1)}\min\{\sqrt{n},\,n/m^{1/3}\})$ [CLN+22]. The fastest known algorithms for packing $s$-rooted arborescences had no congestion, but required $\tilde{O}(m \cdot \mathrm{poly}(k))$ time [BHKP08].

Almost-Optimal Approximation Algorithms for Global Minimum Cut in Directed Graphs 2025-12-16
Show

We develop new $(1+ε)$-approximation algorithms for finding the global minimum edge-cut in a directed edge-weighted graph, and for finding the global minimum vertex-cut in a directed vertex-weighted graph. Our algorithms are randomized, and have a running time of $O\left(m^{1+o(1)}/ε\right)$ on any $m$-edge $n$-vertex input graph, assuming all edge/vertex weights are polynomially-bounded. In particular, for any constant $ε>0$, our algorithms have an almost-optimal running time of $O\left(m^{1+o(1)}\right)$. The fastest previously-known running time for this setting, due to (Cen et al., FOCS 2021), is $\tilde{O}\left(\min\left\{n^2/ε^2,\, m^{1+o(1)}\sqrt{n}\right\}\right)$ for Minimum Edge-Cut, and $\tilde{O}\left(n^2/ε^2\right)$ for Minimum Vertex-Cut. Our results further extend to the rooted variants of the Minimum Edge-Cut and Minimum Vertex-Cut problems, where the algorithm is additionally given a root vertex $r$, and the goal is to find a minimum-weight cut separating any vertex from the root $r$. In terms of techniques, we build upon and extend a framework that was recently introduced by (Chuzhoy et al., SODA 2026) for solving the Minimum Vertex-Cut problem in unweighted directed graphs. Additionally, in order to obtain our result for the Global Minimum Vertex-Cut problem, we develop a novel black-box reduction from this problem to its rooted variant. Prior to our work, such reductions were only known for more restricted settings, such as when all vertex-weights are unit.

40 pages. Submitted to STOC 2026

Deterministic and Exact Fully-dynamic Minimum Cut of Superpolylogarithmic Size in Subpolynomial Time 2025-12-15
Show

We present an exact fully-dynamic minimum cut algorithm that runs in $n^{o(1)}$ deterministic update time when the minimum cut size is at most $2^{Θ(\log^{3/4-c}n)}$ for any $c>0$, improving on the previous algorithm of Jin, Sun, and Thorup (SODA 2024) whose minimum cut size limit is $(\log n)^{o(1)}$. Combined with graph sparsification, we obtain the first $(1+ε)$-approximate fully-dynamic minimum cut algorithm on weighted graphs, for any $ε\ge2^{-Θ(\log^{3/4-c}n)}$, in $n^{o(1)}$ randomized update time. Our main technical contribution is a deterministic local minimum cut algorithm, which replaces the randomized LocalKCut procedure from El-Hayek, Henzinger, and Li (SODA 2025).

To appear at SODA 2026

Maximum-Flow and Minimum-Cut Sensitivity Oracles for Directed Graphs 2025-11-28
Show

Given a digraph $G = (V, E)$ with a designated source $s$, sink $t$, and an $(s,t)$-max-flow of value $λ$, we present constructions for max-flow and min-cut sensitivity oracles, and introduce the concept of a fault-tolerant flow family, which may be of independent interest. Our main contributions are as follows. 1. Fault-Tolerant Flow Family: For any graph $G$ with $(s,t)$-max-flow value $λ$, we construct a family $B$ of $2λ+1$ $(s,t)$-flows such that for every edge $e$, $B$ contains an $(s,t)$-max-flow of $G-e$. 2. Max-Flow Sensitivity Oracle: We construct a single as well as dual-edge sensitivity oracle for $(s,t)$-max-flow that requires only $O(λn)$ space. Given any set $F$ of up to two failing edges, the oracle reports the updated max-flow value in $G-F$ in $O(n)$ time. Additionally, for the single-failure case, the oracle can determine in constant time whether the flow through an edge $x$ changes when another edge $e$ fails. 3. Min-Cut Sensitivity Oracle for Dual Failures: Recently, Baswana et al. (ICALP'22) designed an $O(n^2)$-sized oracle for answering $(s,t)$-min-cut size queries under dual edge failures in constant time. We extend this by focusing on graphs with small min-cut values $λ$, and present a more compact oracle of size $O(λn)$ that answers such min-cut size queries in constant time and reports the corresponding $(s,t)$-min-cut partition in $O(n)$ time. 4. Min-Cut Sensitivity Oracle for Multiple Failures: We extend our results to the general case of $k$ edge failures. For any graph with $(s,t)$-min-cut of size $λ$, we construct a $k$-fault-tolerant min-cut oracle with space complexity $O_{λ,k}(n \log n)$ that answers min-cut size queries in $O_{λ,k}(\log n)$ time.

Faster All-Pairs Minimum Cut: Bypassing Exact Max-Flow 2025-11-13
Show

All-Pairs Minimum Cut (APMC) is a fundamental graph problem that asks to find a minimum $s,t$-cut for every pair of vertices $s,t$. A recent line of work on fast algorithms for APMC has culminated with a reduction of APMC to $\mathrm{polylog}(n)$-many max-flow computations. But unfortunately, no fast algorithms are currently known for exact max-flow in several standard models of computation, such as the cut-query model and the fully-dynamic model. Our main technical contribution is a sparsifier that preserves all minimum $s,t$-cuts in an unweighted graph, and can be constructed using only approximate max-flow computations. We then use this sparsifier to devise new algorithms for APMC in unweighted graphs in several computational models: (i) a randomized algorithm that makes $\tilde{O}(n^{3/2})$ cut queries to the input graph; (ii) a deterministic fully-dynamic algorithm with $n^{3/2+o(1)}$ worst-case update time; and (iii) a randomized two-pass streaming algorithm with space requirement $\tilde{O}(n^{3/2})$. These results improve over the known bounds, even for (single pair) minimum $s,t$-cut in the respective models.

An Effective Flow-based Method for Positive-Unlabeled Learning: 2-HNC 2025-11-02
Show

In many scenarios of binary classification, only positive instances are provided in the training data, leaving the rest of the data unlabeled. This setup, known as positive-unlabeled (PU) learning, is addressed here with a network flow-based method which utilizes pairwise similarities between samples. The method we propose here, 2-HNC, leverages Hochbaum's Normalized Cut (HNC) and the set of solutions it provides by solving a parametric minimum cut problem. The set of solutions, that are nested partitions of the samples into two sets, correspond to varying tradeoff values between the two goals: high intra-similarity inside the sets and low inter-similarity between the two sets. This nested sequence is utilized here to deliver a ranking of unlabeled samples by their likelihood of being negative. Building on this insight, our method, 2-HNC, proceeds in two stages. The first stage generates this ranking without assuming any negative labels, using a problem formulation that is constrained only on positive labeled samples. The second stage augments the positive set with likely-negative samples and recomputes the classification. The final label prediction selects among all generated partitions in both stages, the one that delivers a positive class proportion, closest to a prior estimate of this quantity, which is assumed to be given. Extensive experiments across synthetic and real datasets show that 2-HNC yields strong performance and often surpasses existing state-of-the-art algorithms.

All-Pairs Minimum Cut using $\tilde{O}(n^{7/4})$ Cut Queries 2025-10-19
Show

We present the first non-trivial algorithm for the all-pairs minimum cut problem in the cut-query model. Given cut-query access to an unweighted graph $G=(V,E)$ with $n$ vertices, our randomized algorithm constructs a Gomory-Hu tree of $G$, and thus solves the all-pairs minimum cut problem, using $\tilde{O}(n^{7/4})$ cut queries.

Differentially Private Algorithms for Graphs Under Continual Observation 2025-09-23
Show

Differentially private algorithms protect individuals in data analysis scenarios by ensuring that there is only a weak correlation between the existence of the user in the data and the result of the analysis. Dynamic graph algorithms maintain the solution to a problem (e.g., a matching) on an evolving input, i.e., a graph where nodes or edges are inserted or deleted over time. They output the value of the solution after each update operation, i.e., continuously. We study (event-level and user-level) differentially private algorithms for graph problems under continual observation, i.e., differentially private dynamic graph algorithms. We present event-level private algorithms for partially dynamic counting-based problems such as triangle count that improve the additive error by a polynomial factor (in the length $T$ of the update sequence) on the state of the art, resulting in the first algorithms with additive error polylogarithmic in $T$. We also give $\varepsilon$-differentially private and partially dynamic algorithms for minimum spanning tree, minimum cut, densest subgraph, and maximum matching. The additive error of our improved MST algorithm is $O(W \log^{3/2}T / \varepsilon)$, where $W$ is the maximum weight of any edge, which, as we show, is tight up to a $(\sqrt{\log T} / \varepsilon)$-factor. For the other problems, we present a partially-dynamic algorithm with multiplicative error $(1+β)$ for any constant $β> 0$ and additive error $O(W \log(nW) \log(T) / (\varepsilon β))$. Finally, we show that the additive error for a broad class of dynamic graph algorithms with user-level privacy must be linear in the value of the output solution's range.

Corrected typos in lower bounds in Table 1. Fixed missing factor $\ell$ in statement of Theorem 45

Cut-Query Algorithms with Few Rounds 2025-09-17
Show

In the cut-query model, the algorithm can access the input graph $G=(V,E)$ only via cut queries that report, given a set $S\subseteq V$, the total weight of edges crossing the cut between $S$ and $V\setminus S$. This model was introduced by Rubinstein, Schramm and Weinberg [ITCS'18] and its investigation has so far focused on the number of queries needed to solve optimization problems, such as global minimum cut. We turn attention to the round complexity of cut-query algorithms, and show that several classical problems can be solved in this model with only a constant number of rounds. Our main results are algorithms for finding a minimum cut in a graph, that offer different tradeoffs between round complexity and query complexity, where $n=|V|$.

Efficient Contractions of Dynamic Graphs -- with Applications 2025-09-05
Show

A non-trivial minimum cut (NMC) sparsifier is a multigraph $\hat{G}$ that preserves all non-trivial minimum cuts of a given undirected graph $G$. We introduce a flexible data structure for fully dynamic graphs that can efficiently provide an NMC sparsifier upon request at any point during the sequence of updates. We employ simple dynamic forest data structures to achieve a fast from-scratch construction of the sparsifier at query time. Based on the strength of the adversary and desired type of time bounds, the data structure comes with different guarantees. Specifically, let $G$ be a fully dynamic simple graph with $n$ vertices and minimum degree $δ$. Then our data structure supports an insertion/deletion of an edge to/from $G$ in $n^{o(1)}$ worst-case time. Furthermore, upon request, it can return w.h.p. an NMC sparsifier of $G$ that has $O(n/δ)$ vertices and $O(n)$ edges, in $\hat{O}(n)$ time. The probabilistic guarantees hold against an adaptive adversary. Alternatively, the update and query times can be improved to $\tilde{O}(1)$ and $\tilde{O}(n)$ respectively, if amortized-time guarantees are sufficient, or if the adversary is oblivious. We discuss two applications of our data structure. First, it can be used to efficiently report a cactus representation of all minimum cuts of a fully dynamic simple graph. Using the NMC sparsifier we can w.h.p. build this cactus in worst-case time $\hat{O}(n)$ against an adaptive adversary. Second, our data structure allows us to efficiently compute the maximal $k$-edge-connected subgraphs of undirected simple graphs, by repeatedly applying a minimum cut algorithm on the NMC sparsifier. Specifically, we can compute w.h.p. the maximal $k$-edge-connected subgraphs of a simple graph with $n$ vertices and $m$ edges in $\tilde{O}(m+n^2/k)$ time which is an improvement for $k = Ω(n^{1/8})$ and works for fully dynamic graphs.

Min cost flow on unit capacity networks and convex cost K-flow are as easy as the assignment problem with All-Min-Cuts algorithm 2025-07-29
Show

We explore here surprising links between the time-cost-tradeoff problem and the minimum cost flow problem that lead to fast, strongly polynomial, algorithms for both problems. One of the main results is a new algorithm for the unit capacity min cost flow that matches the complexity of the fastest strongly polynomial algorithm known for the assignment problem. The time cost tradeoff (TCT) problem in project management is to expedite the durations of activities, subject to precedence constraints, in order to achieve a target project completion time at minimum expediting costs, or, to maximize the net benefit from a reward associated with project completion time reduction. Each activity is associated with integer normal duration, minimum duration, and expediting cost per unit reduction in duration. We devise here the {\em all-min-cuts} procedure, which for a given maximum flow, is capable of generating all minimum cuts (equivalent to minimum cost expediting) of equal value very efficiently. Equivalently, the procedure identifies all solutions that reside on the TCT curve between consecutive breakpoints in $O(m+n \log n)$ time on average, where $m$ and $n$ are the numbers of arcs and nodes in the network. The all-min-cuts procedure implies faster algorithms for certain TCT problems and for certain min cost flow problems with a significantly different approach from the ones known.

Dual Charging for Half-Integral TSP 2025-07-24
Show

We show that the max entropy algorithm is a randomized 1.49776 approximation for half-integral TSP, improving upon the previous known bound of 1.49993 from Karlin et al. This also improves upon the best-known approximation for half-integral TSP due to Gupta et al. Our improvement results from using the dual, instead of the primal, to analyze the expected cost of the matching. We believe this method of analysis could lead to a simpler proof that max entropy is a better-than-3/2 approximation in the general case. We also give a 1.4671 approximation for half-integral LP solutions with no proper minimum cuts and an even number of vertices, improving upon the bound of Haddadan and Newman of 1.476. We then extend the analysis to the case when there are an odd number of vertices $n$ at the cost of an additional $O(1/n)$ factor.

Fast Algorithms for Graph Arboricity and Related Problems 2025-07-21
Show

We give an algorithm for finding the arboricity of a weighted, undirected graph, defined as the minimum number of spanning forests that cover all edges of the graph, in $\sqrt{n} m^{1+o(1)}$ time. This improves on the previous best bound of $\tilde{O}(nm)$ for weighted graphs and $\tilde{O}(m^{3/2}) $ for unweighted graphs (Gabow 1995) for this problem. The running time of our algorithm is dominated by a logarithmic number of calls to a directed global minimum cut subroutine -- if the running time of the latter problem improves to $m^{1+o(1)}$ (thereby matching the running time of maximum flow), the running time of our arboricity algorithm would improve further to $m^{1+o(1)}$. We also give a new algorithm for computing the entire cut hierarchy -- laminar multiway cuts with minimum cut ratio in recursively defined induced subgraphs -- in $m n^{1+o(1)}$ time. The cut hierarchy yields the ideal edge loads (Thorup 2001) in a fractional spanning tree packing of the graph which, we show, also corresponds to a max-entropy solution in the spanning tree polytope. For the cut hierarchy problem, the previous best bound was $\tilde{O}(n^2 m)$ for weighted graphs and $\tilde{O}(n m^{3/2})$ for unweighted graphs.

FOCS 2025. 25 pages, 3 figures

Bicriteria Polygon Aggregation with Arbitrary Shapes 2025-07-16
Show

We study the problem of aggregating polygons by covering them with disjoint representative regions, thereby inducing a clustering of the polygons. Our objective is to minimize a weighted sum of the total area and the total perimeter of the regions. This problem has applications in cartographic map generalization and urban analytics. Here, the polygons represent building footprints and the clusters may represent urban areas. Previous approaches forced the boundaries of the regions to come from a fixed subdivision of the plane, which allows the optimal solution (restricted in this way) to be found from a minimum cut in a dual graph. It is natural to ask whether the problem can still be solved efficiently if this restriction is removed, allowing output regions to be bounded by arbitrary curves. We provide a positive answer in the form of a polynomial-time algorithm. Additionally, we fully characterize the optimal solutions by showing that their boundaries are composed of input polygon edges and circular arcs of constant radius. Since some applications favor straight edges, we also study two problem variants in which the output regions must be polygons, but are not restricted to have boundaries from a fixed subdivision. In the first variant, region vertices must lie on the boundaries of the input polygons. The second variant requires them to be vertices of the input polygons. We show that both variants can be approximated up to a constant factor in polynomial time by altering an optimal solution for the unrestricted problem. Our experimental evaluation on real-world building footprints demonstrates that these approximate solutions are visually similar to the optimal unrestricted ones and achieve near-optimal objective values.

Approximating the Held-Karp Bound for Metric TSP in Nearly Linear Work and Polylogarithmic Depth 2025-06-23

We present a nearly linear work parallel algorithm for approximating the Held-Karp bound for the Metric TSP problem. Given an edge-weighted undirected graph $G=(V,E)$ on $m$ edges and $ε>0$, it returns a $(1+ε)$-approximation to the Held-Karp bound with high probability, in $\tilde{O}(m/ε^4)$ work and $\tilde{O}(1/ε^4)$ depth. While a nearly linear time sequential algorithm was known for almost a decade (Chekuri and Quanrud'17), it was not known how to simultaneously achieve nearly linear work alongside polylogarithmic depth. Using a reduction by Chalermsook et al.'22, we also give a parallel algorithm for computing a $(1+ε)$-approximate fractional solution to the $k$-edge-connected spanning subgraph (kECSS) problem, with similar complexity. To obtain these results, we introduce a notion of core-sequences for the parallel Multiplicative Weights Update (MWU) framework (Luby-Nisan'93, Young'01). For the Metric TSP and kECSS problems, core-sequences enable us to exploit the structure of approximate minimum cuts to reduce the cost per iteration and/or the number of iterations. The acceleration technique via core-sequences is generic and of independent interest. In particular, it improves the best-known iteration complexity of MWU algorithms for packing/covering LPs from $poly(\log nnz(A))$ to polylogarithmic in the product of cardinalities of the core-sequence sets, where $A$ is the constraint matrix of the LP. For certain implicitly defined LPs such as the kECSS LP, this yields an exponential improvement in depth.

Breaking the O(mn)-Time Barrier for Vertex-Weighted Global Minimum Cut 2025-06-18

We consider the Global Minimum Vertex-Cut problem: given an undirected vertex-weighted graph $G$, compute a minimum-weight subset of its vertices whose removal disconnects $G$. The problem is closely related to Global Minimum Edge-Cut, where the weights are on the graph edges instead of vertices, and the goal is to compute a minimum-weight subset of edges whose removal disconnects the graph. Global Minimum Cut is one of the most basic and extensively studied problems in combinatorial optimization and graph theory. While an almost-linear time algorithm has been known for the edge version of the problem for a while (Karger, STOC 1996 and J. ACM 2000), the fastest previous algorithm for the vertex version (Henzinger, Rao and Gabow, FOCS 1996 and J. Algorithms 2000) achieves a running time of $\tilde{O}(mn)$, where $m$ and $n$ denote the number of edges and vertices in the input graph, respectively. For the special case of unit vertex weights, this bound was broken only recently (Li et al., STOC 2021); their result, combined with the recent breakthrough almost-linear time algorithm for Maximum $s$-$t$ Flow (Chen et al., FOCS 2022, van den Brand et al., FOCS 2023), yields an almost-linear time algorithm for Global Minimum Vertex-Cut with unit vertex weights. In this paper we break the 28-year-old bound of Henzinger et al. for the general weighted Global Minimum Vertex-Cut, by providing a randomized algorithm for the problem with running time $O(\min\{mn^{0.99+o(1)}, m^{1.5+o(1)}\})$.

A Framework for the Design of Efficient Diversification Algorithms to NP-Hard Problems 2025-06-10

There has been considerable recent interest in computing a diverse collection of solutions to a given optimization problem, both in the AI and theory communities. Given a classical optimization problem $Π$ (e.g., spanning tree, minimum cuts, maximum matching, minimum vertex cover) with input size $n$ and an integer $k\geq 1$, the goal is to generate a collection of $k$ maximally diverse solutions to $Π$. This diverse-X paradigm not only allows the user to generate very different solutions, but also helps make systems more secure and robust by handling uncertainty, and achieve energy efficiency. For problems $Π$ in P (such as spanning tree and minimum cut), there are efficient $\text{poly}(n,k)$ approximation algorithms available for the diverse variants [Hanaka et al. AAAI 2021, 2022, 2023, Gao et al. LATIN 2022, de Berg et al. ISAAC 2023]. In contrast, only FPT algorithms are known for NP-hard problems such as vertex covers and independent sets [Baste et al. IJCAI 2020, Eiben et al. SODA 2024, Misra et al. ISAAC 2024, Austrin et al. ICALP 2025], but in the worst case, these algorithms run in time $\exp((kn)^c)$ for some $c>0$. In this work, we address this gap and give $\text{poly}(n,k)$ or $f(k)\text{poly}(n)$ time approximation algorithms for diversification variants of several NP-hard problems such as knapsack, maximum weight independent sets (MWIS) and minimum vertex covers in planar graphs, geometric (rectangle) knapsack, enclosing points by polygon, and MWIS in unit-disk-graphs of points in convex position. Our results are achieved by developing a general framework and applying it to problems with textbook dynamic-programming algorithms to find one solution.

Spectral Clustering for Directed Graphs via Likelihood Estimation on Stochastic Block Models 2025-06-03

Graph clustering is a fundamental task in unsupervised learning with broad real-world applications. While spectral clustering methods for undirected graphs are well-established and guided by a minimum cut optimization consensus, their extension to directed graphs remains relatively underexplored due to the additional complexity introduced by edge directions. In this paper, we leverage statistical inference on stochastic block models to guide the development of a spectral clustering algorithm for directed graphs. Specifically, we study the maximum likelihood estimation under a widely used directed stochastic block model, and derive a global objective function that aligns with the underlying community structure. We further establish a theoretical upper bound on the misclustering error of its spectral relaxation, and based on this relaxation, introduce a novel, self-adaptive spectral clustering method for directed graphs. Extensive experiments on synthetic and real-world datasets demonstrate significant performance gains over existing baselines.

Near-Optimal Minimum Cuts in Hypergraphs at Scale 2025-04-30

The hypergraph minimum cut problem aims to partition its vertices into two blocks while minimizing the total weight of the cut hyperedges. This fundamental problem arises in network reliability, VLSI design, and community detection. We present HeiCut, a scalable algorithm for computing near-optimal minimum cuts in both unweighted and weighted hypergraphs. HeiCut aggressively reduces the hypergraph size through a sequence of provably exact reductions that preserve the minimum cut, along with an optional heuristic contraction based on label propagation. It then solves a relaxed Binary Integer Linear Program (BIP) on the reduced hypergraph to compute a near-optimal minimum cut. Our extensive evaluation on over 500 real-world hypergraphs shows that HeiCut computes the exact minimum cut in over 85% of instances using our exact reductions alone, and offers the best solution quality across all instances. It solves over twice as many instances as the state-of-the-art within set computational limits, and is up to five orders of magnitude faster.

The Case for External Graph Sketching 2025-04-24

Algorithms in the data stream model use $O(polylog(N))$ space to compute some property of an input of size $N$, and many of these algorithms are implemented and used in practice. However, sketching algorithms in the graph semi-streaming model use $O(V polylog(V))$ space for a $V$-vertex graph, and the fact that implementations of these algorithms are not used in the academic literature or in industrial applications is perhaps because this space requirement is too large for RAM on today's hardware. In this paper we introduce the external semi-streaming model, which addresses the aspects of the semi-streaming model that limit its practical impact. In this model, the input is in the form of a stream and $O(V polylog(V))$ space is available, but most of that space is accessible only via block I/O operations as in the external memory model. The goal in the external semi-streaming model is to simultaneously achieve small space and low I/O cost. We present a general transformation from any vertex-based sketch algorithm to one which has a low sketching cost in the new model. We prove that this automatic transformation is tight, or nearly tight up to an $O(\log(V))$ factor, via an I/O lower bound for the task of sketching the input stream. Using this transformation and other techniques, we present external semi-streaming algorithms for connectivity, bipartiteness testing, $(1+ε)$-approximating MST weight, testing k-edge connectivity, $(1+ε)$-approximating the minimum cut of a graph, computing $ε$-cut sparsifiers, and approximating the density of the densest subgraph. These algorithms all use $O(V \, poly(\log(V), ε^{-1}, k))$ space. For many of these problems, our external semi-streaming algorithms outperform the state-of-the-art algorithms in both the sketching and external-memory models.

Full version for paper to appear in ACDA proceedings

Multicut Problems in Embedded Graphs: The Dependency of Complexity on the Demand Pattern 2025-04-15

The Multicut problem asks for a minimum cut separating certain pairs of vertices: formally, given a graph $G$ and demand graph $H$ on a set $T\subseteq V(G)$ of terminals, the task is to find a minimum-weight set $C$ of edges of $G$ such that whenever two vertices of $T$ are adjacent in $H$, they are in different components of $G\setminus C$. Colin de Verdière [Algorithmica, 2017] showed that Multicut with $t$ terminals on a graph $G$ of genus $g$ can be solved in time $f(t,g)n^{O(\sqrt{g^2+gt+t})}$. Cohen-Addad et al. [JACM, 2021] proved a matching lower bound showing that the exponent of $n$ is essentially best possible (for every fixed value of $t$ and $g$), even in the special case of Multiway Cut, where the demand graph $H$ is a complete graph. However, this lower bound tells us nothing about other special cases of Multicut such as Group 3-Terminal Cut (where three groups of terminals need to be separated from each other). We show that if the demand pattern is, in some sense, close to being a complete bipartite graph, then Multicut can be solved faster than $f(t,g)n^{O(\sqrt{g^2+gt+t})}$, and furthermore this is the only property that allows such an improvement. Formally, for a class $\mathcal{H}$ of graphs, Multicut$(\mathcal{H})$ is the special case where the demand graph $H$ is in $\mathcal{H}$. For every fixed class $\mathcal{H}$ (satisfying some mild closure property), fixed $g$, and fixed $t$, our main result gives tight upper and lower bounds on the exponent of $n$ in algorithms solving Multicut$(\mathcal{H})$.

Minimum Cut Representability of Stable Matching Problems 2025-04-06

We introduce and study Minimum Cut Representability, a framework to solve optimization and feasibility problems over stable matchings by representing them as minimum s-t cut problems on digraphs over rotations. We provide necessary and sufficient conditions on objective functions and feasibility sets for problems to be minimum cut representable. In particular, we define the concepts of first and second order differentials of a function over stable matchings and show that a problem is minimum cut representable if and only if, roughly speaking, the objective function can be expressed solely using these differentials, and the feasibility set is a sublattice of the stable matching lattice. To demonstrate the practical relevance of our framework, we study a range of real-world applications, including problems involving school choice with siblings and a two-stage stochastic stable matching problem. We show how our framework can be used to help solving these problems.

Network Unreliability in Almost-Linear Time 2025-03-30

The network unreliability problem asks for the probability that a given undirected graph gets disconnected when every edge independently fails with a given probability $p$. Valiant (1979) showed that this problem is #P-hard; therefore, the best we can hope for are approximation algorithms. In a classic result, Karger (1995) obtained the first FPTAS for this problem by leveraging the fact that when a graph disconnects, it almost always does so at a near-minimum cut, and there are only a small (polynomial) number of near-minimum cuts. Since then, a series of results have obtained progressively faster algorithms to the current bound of $m^{1+o(1)} + \tilde{O}(n^{3/2})$ (Cen, He, Li, and Panigrahi, 2024). In this paper, we obtain an $m^{1+o(1)}$-time algorithm for the network unreliability problem. This is essentially optimal, since we need $O(m)$ time to read the input graph. Our main new ingredient is relating network unreliability to an {\em ideal} tree packing of spanning trees (Thorup, 2001).

To appear in STOC 2025
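On tiny graphs, the unreliability value that these FPTASes approximate can be computed exactly by enumerating all $2^m$ edge-failure patterns, which is handy for sanity-checking an estimator. This brute-force sketch is ours, not the paper's method:

```python
# Exact network unreliability by enumeration: probability that the graph
# disconnects when each edge fails independently with probability p.
# Exponential in the number of edges; for sanity checks on tiny graphs only.
from itertools import product

def unreliability_exact(n, edges, p):
    total = 0.0
    for alive in product([False, True], repeat=len(edges)):
        prob = 1.0
        kept = []
        for keep, e in zip(alive, edges):
            prob *= (1 - p) if keep else p
            if keep:
                kept.append(e)
        # Connectivity check on surviving edges via union-find.
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        for u, v in kept:
            parent[find(u)] = find(v)
        if len({find(v) for v in range(n)}) > 1:
            total += prob
    return total
```

For a triangle with $p = 1/2$, the graph stays connected iff at least two edges survive, so the unreliability is exactly $1/2$.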

Faster Global Minimum Cut with Predictions 2025-03-06

Global minimum cut is a fundamental combinatorial optimization problem with wide-ranging applications. Often in practice, these problems are solved repeatedly on families of similar or related instances. However, the de facto algorithmic approach is to solve each instance of the problem from scratch discarding information from prior instances. In this paper, we consider how predictions informed by prior instances can be used to warm-start practical minimum cut algorithms. The paper considers the widely used Karger's algorithm and its counterpart, the Karger-Stein algorithm. Given good predictions, we show these algorithms become near-linear time and have robust performance to erroneous predictions. Both of these algorithms are randomized edge-contraction algorithms. Our natural idea is to probabilistically prioritize the contraction of edges that are unlikely to be in the minimum cut.
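The randomized edge-contraction idea can be sketched in a few lines. The toy below is illustrative, not the paper's warm-started algorithm: it repeats Karger-style contraction, optionally biasing edge choice with a `priority` map (higher weight means more likely to be contracted, i.e., predicted unlikely to cross the minimum cut). All names and the `priority` interface are our assumptions:

```python
# Toy Karger-style contraction with an optional prediction-based bias.
# Assumes a connected simple graph on vertices 0..n-1.
import random

def contract_min_cut(n, edges, trials=200, priority=None, seed=0):
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        pool = list(edges)
        components = n
        while components > 2:
            # Weight edges by predicted likelihood of NOT crossing the min cut.
            weights = [priority.get(e, 1.0) if priority else 1.0 for e in pool]
            u, v = rng.choices(pool, weights=weights)[0]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
            # Drop edges that became internal to a supernode.
            pool = [e for e in pool if find(e[0]) != find(e[1])]
        best = min(best, len(pool))
    return best
```

With all priorities equal this is plain Karger contraction; the warm-start idea amounts to skewing those weights using predictions from prior instances.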

An O(log n)-Approximation Algorithm for (p,q)-Flexible Graph Connectivity via Independent Rounding 2025-01-22

In the $(p,q)$-Flexible Graph Connectivity problem, the input is a graph $G = (V,E)$ with the edge set $E = \mathscr{S} \cup \mathscr{U}$ partitioned into safe and unsafe edges, and the goal is to find a minimum cost set of edges $F$ such that the subgraph $(V,F)$ remains $p$-edge-connected after removing any $q$ unsafe edges from $F$. We give a new integer programming formulation for the problem, by adding knapsack cover constraints to the $p(p+q)$-connected capacitated edge-connectivity formulation studied in previous work, and show that the corresponding linear relaxation can be solved in polynomial time by giving an efficient separation oracle. Further, we show that independent randomized rounding yields an $O(\log n)$-approximation for arbitrary values of $p$ and $q$, improving the state-of-the-art $O(q\log n)$. For both separation and rounding, a key insight is to use Karger's bound on the number of near-minimum cuts.

11 pages
Fully Dynamic Approximate Minimum Cut in Subpolynomial Time per Operation 2025-01-06

Dynamically maintaining the minimum cut in a graph $G$ under edge insertions and deletions is a fundamental problem in dynamic graph algorithms for which no conditional lower bound on the time per operation exists. In an $n$-node graph the best known $(1+o(1))$-approximate algorithm takes $\tilde O(\sqrt{n})$ update time [Thorup 2007]. If the minimum cut is guaranteed to be $(\log n)^{o(1)}$, a deterministic exact algorithm with $n^{o(1)}$ update time exists [Jin, Sun, Thorup 2024]. We present the first fully dynamic algorithm for $(1+o(1))$-approximate minimum cut with $n^{o(1)}$ update time. Our main technical contribution is to show that it suffices to consider small-volume cuts in suitably contracted graphs.

To appear at SODA2025

Space Complexity of Minimum Cut Problems in Single-Pass Streams 2024-12-06

We consider the problem of finding a minimum cut of a weighted graph presented as a single-pass stream. While graph sparsification in streams has been intensively studied, the specific application of finding minimum cuts in streams is less well-studied. To this end, we show upper and lower bounds on minimum cut problems in insertion-only streams for a variety of settings, including for both randomized and deterministic algorithms, for both arbitrary and random order streams, and for both approximate and exact algorithms. One of our main results is an $\widetilde{O}(n/\varepsilon)$ space algorithm with fast update time for approximating a spectral cut query with high probability on a stream given in an arbitrary order. Our result breaks the $Ω(n/\varepsilon^2)$ space lower bound required of a sparsifier that approximates all cuts simultaneously. Using this result, we provide streaming algorithms with near optimal space of $\widetilde{O}(n/\varepsilon)$ for minimum cut and approximate all-pairs effective resistances, with matching space lower-bounds. The amortized update time of our algorithms is $\widetilde{O}(1)$, provided that the number of edges in the input graph is at least $(n/\varepsilon^2)^{1+o(1)}$. We also give a generic way of incorporating sketching into a recursive contraction algorithm to improve the post-processing time of our algorithms. In addition to these results, we give a random-order streaming algorithm that computes the {\it exact} minimum cut on a simple, unweighted graph using $\widetilde{O}(n)$ space. Finally, we give an $Ω(n/\varepsilon^2)$ space lower bound for deterministic minimum cut algorithms which matches the best-known upper bound up to polylogarithmic factors.

25+3 pages, 2 figures. Accepted to ITCS 2025. v2: minor updates to author information

Tree-Packing Revisited: Faster Fully Dynamic Min-Cut and Arboricity 2024-12-04

A tree-packing is a collection of spanning trees of a graph. It has been a useful tool for computing the minimum cut in static, dynamic, and distributed settings. In particular, [Thorup, Comb. 2007] used tree-packings to obtain his dynamic min-cut algorithm with $\tilde O(λ^{14.5}\sqrt{n})$ worst-case update time. We reexamine this relationship, showing that fewer spanning trees need to be maintained for such a result: packing $Θ(λ^3 \log m)$ greedy trees suffices to guarantee a 1-respecting cut or a trivial cut in some contracted graph. Based on this structural result, we then provide a deterministic algorithm for fully dynamic exact min-cut, that has $\tilde O(λ^{5.5}\sqrt{n})$ worst-case update time, for min-cut value bounded by $λ$. In particular, this also leads to an algorithm for general fully dynamic exact min-cut with $\tilde O(m^{1-1/12})$ amortized update time, improving upon $\tilde O(m^{1-1/31})$ [Goranci et al., SODA 2023]. We also give the first fully dynamic algorithm that maintains a $(1+\varepsilon)$-approximation of the fractional arboricity -- which is strictly harder than the integral arboricity. Our algorithm is deterministic and has $O(α\log^6 m/\varepsilon^4)$ amortized update time, for arboricity at most $α$. We extend these results to a Monte Carlo algorithm with $O(\text{poly}(\log m,\varepsilon^{-1}))$ amortized update time against an adaptive adversary. Our algorithms work on multi-graphs as well. Both results are obtained by exploring the connection between the min-cut/arboricity and (greedy) tree-packing. We investigate tree-packing in a broader sense, including a lower bound for greedy tree-packing, which, to the best of our knowledge, is the first progress on this topic since [Thorup, Comb. 2007].

To be presented at SODA '25
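As a rough illustration of greedy tree packing (not the paper's algorithm, and with our own naming): repeatedly compute a minimum spanning tree with respect to current edge loads, then increment the load of every edge the tree uses.

```python
# Greedy tree packing sketch: MST under current loads, then bump loads.

def kruskal_mst(n, edges, load):
    """Kruskal's MST of an n-vertex graph, with edge weights given by `load`."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree = []
    for e in sorted(edges, key=lambda e: load[e]):
        ru, rv = find(e[0]), find(e[1])
        if ru != rv:
            parent[ru] = rv
            tree.append(e)
    return tree

def greedy_tree_packing(n, edges, rounds):
    load = {e: 0 for e in edges}
    trees = []
    for _ in range(rounds):
        t = kruskal_mst(n, edges, load)
        for e in t:
            load[e] += 1
        trees.append(t)
    return trees, load
```

On a graph made of two triangles joined by one bridge edge, every spanning tree must use the bridge, so its load grows by one per round, mirroring the intuition that heavily loaded edges sit in small cuts.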

Linear-Time Algorithms for k-Edge-Connected Components, k-Lean Tree Decompositions, and More 2024-11-04

We present $k^{O(k^2)} m$ time algorithms for various problems about decomposing a given undirected graph by edge cuts or vertex separators of size $<k$ into parts that are ``well-connected'' with respect to cuts or separators of size $<k$; here, $m$ is the total number of vertices and edges of the graph. As an application of our results, we obtain for every fixed $k$ a linear-time algorithm for computing the $k$-edge-connected components of a given graph, solving a long-standing open problem. More generally, we obtain a $k^{O(k^2)} m$ time algorithm for computing a $k$-Gomory-Hu tree of a given graph, which is a structure representing pairwise minimum cuts of size $<k$. Our main technical result, from which the other results follow, is a $k^{O(k^2)} m$ time algorithm for computing a $k$-lean tree decomposition of a given graph. This is a tree decomposition with adhesion size $<k$ that captures the existence of separators of size $<k$ between subsets of its bags. A $k$-lean tree decomposition is also an unbreakable tree decomposition with optimal unbreakability parameters for the adhesion size bound $k$. As further applications, we obtain $k^{O(k^2)} m$ time algorithms for $k$-vertex connectivity and for element connectivity $k$-Gomory-Hu tree. All of our algorithms are deterministic. Our techniques are inspired by the tenth paper of the Graph Minors series of Robertson and Seymour and by Bodlaender's parameterized linear-time algorithm for treewidth.

107 pages
A Simpler Approach for Monotone Parametric Minimum Cut: Finding the Breakpoints in Order 2024-10-21

We present parametric breadth-first search (PBFS), a new algorithm for solving the parametric minimum cut problem in a network with source-sink-monotone capacities. The objective is to find the set of breakpoints, i.e., the points at which the minimum cut changes. It is well known that this problem can be solved in the same asymptotic runtime as the static minimum cut problem. However, existing algorithms that achieve this runtime bound involve fairly complicated steps that are inefficient in practice. PBFS uses a simpler approach that discovers the breakpoints in ascending order, which allows it to achieve the desired runtime bound while still performing well in practice. We evaluate our algorithm on benchmark instances from polygon aggregation and computer vision. Polygon aggregation was recently proposed as an application for parametric minimum cut, but the monotonicity property has not been exploited fully. PBFS outperforms the state of the art on most benchmark instances, usually by a factor of 2-3. It is particularly strong on instances with many breakpoints, which is the case for polygon aggregation. Compared to the existing min-cut-based approach for polygon aggregation, PBFS scales much better with the instance size. On large instances with millions of vertices, it is able to compute all breakpoints in a matter of seconds.

Graph Cuts with Arbitrary Size Constraints Through Optimal Transport 2024-10-04

A common way of partitioning graphs is through minimum cuts. One drawback of classical minimum cut methods is that they tend to produce small groups, which is why more balanced variants such as normalized and ratio cuts have seen more success. However, we believe that with these variants, the balance constraints can be too restrictive for some applications like for clustering of imbalanced datasets, while not being restrictive enough when searching for perfectly balanced partitions. Here, we propose a new graph cut algorithm for partitioning graphs under arbitrary size constraints. We formulate the graph cut problem as a Gromov-Wasserstein with a concave regularizer problem. We then propose to solve it using an accelerated proximal GD algorithm which guarantees global convergence to a critical point, results in sparse solutions and only incurs an additional factor of $\mathcal{O}(\log(n))$ compared to the classical spectral clustering algorithm, while being observed to be more efficient in practice.

Published in Transactions on Machine Learning Research

Optimal Sensitivity Oracle for Steiner Mincut 2024-09-26

Let $G=(V,E)$ be an undirected weighted graph on $n=|V|$ ...
Mimicking Networks for Constrained Multicuts in Hypergraphs 2024-09-20

In this paper, we study a \emph{multicut-mimicking network} for a hypergraph over terminals $T$ with a parameter $c$. It is a hypergraph preserving the minimum multicut values of any set of pairs over $T$ where the value is at most $c$. This is a new variant of the multicut-mimicking network of a graph in [Wahlström ICALP'20], which introduces a parameter $c$ and extends it to handle hypergraphs. Additionally, it is a natural extension of the \emph{connectivity-$c$ mimicking network} introduced by [Chalermsook et al. SODA'21] and [Jiang et al. ESA'22] that is a (hyper)graph preserving the minimum cut values between two subsets of terminals where the value is at most $c$. We propose an algorithm for a hypergraph that returns a multicut-mimicking network over terminals $T$ with a parameter $c$ having $|T|$ ...
Finding Diverse Minimum s-t Cuts 2024-09-18

Recently, many studies have been devoted to finding diverse solutions in classical combinatorial problems, such as Vertex Cover (Baste et al., IJCAI'20), Matching (Fomin et al., ISAAC'20) and Spanning Tree (Hanaka et al., AAAI'21). We initiate the algorithmic study of $k$-Diverse Minimum s-t Cuts which, given a directed graph $G = (V, E)$, two specified vertices $s,t \in V$, and an integer $k > 0$, asks for a collection of $k$ minimum $s$-$t$ cuts in $G$ that has maximum diversity. We investigate the complexity of the problem for maximizing three diversity measures that can be applied to a collection of cuts: (i) the sum of all pairwise Hamming distances, (ii) the cardinality of the union of cuts in the collection, and (iii) the minimum pairwise Hamming distance. We prove that $k$-Diverse Minimum s-t Cuts can be solved in strongly polynomial time for diversity measures (i) and (ii) via submodular function minimization. We obtain this result by establishing a connection between ordered collections of minimum $s$-$t$ cuts and the theory of distributive lattices. When restricted to finding only collections of mutually disjoint solutions, we provide a more practical algorithm that finds a maximum set of pairwise disjoint minimum $s$-$t$ cuts. For graphs with small minimum $s$-$t$ cut, it runs in the time of a single max-flow computation. Our results stand in contrast to the problem of finding $k$ diverse global minimum cuts -- which is known to be NP-hard even for the disjoint case (Hanaka et al., AAAI'23) -- and partially answer a long-standing open question of Wagner (Networks, 1990) about improving the complexity of finding disjoint collections of minimum $s$-$t$ cuts. Lastly, we show that $k$-Diverse Minimum s-t Cuts subject to diversity measure (iii) is NP-hard already for $k=3$.

An earlier version of this work appeared at the 34th International Symposium on Algorithms and Computation (ISAAC 2023). Corrected typos in Section 3 and revised arguments in Section 4. Results unchanged. Added new complexity results in Section 5. Readded missing acknowledgments section

The Parameterized Complexity Landscape of Two-Sets Cut-Uncut 2024-08-24

In Two-Sets Cut-Uncut, we are given an undirected graph $G=(V,E)$ and two terminal sets $S$ and $T$. The task is to find a minimum cut $C$ in $G$ (if there is any) separating $S$ from $T$ under the following ``uncut'' condition. In the graph $(V,E \setminus C)$, the terminals in each terminal set remain in the same connected component. In spite of the superficial similarity to the classic problem Minimum $s$-$t$-Cut, Two-Sets Cut-Uncut is computationally challenging. In particular, even deciding whether such a cut of any size exists, is already NP-complete. We initiate a systematic study of Two-Sets Cut-Uncut within the context of parameterized complexity. By leveraging known relations between many well-studied graph parameters, we characterize the structural properties of input graphs that allow for polynomial kernels, fixed-parameter tractability (FPT), and slicewise polynomial algorithms (XP). Our main contribution is the near-complete establishment of the complexity of these algorithmic properties within the described hierarchy of graph parameters. On a technical level, our main results are fixed-parameter tractability for the (vertex-deletion) distance to cographs and an OR-cross composition excluding polynomial kernels for the vertex cover number of the input graph (under the standard complexity assumption NP is not contained in coNP/poly).

Interdiction of minimum spanning trees and other matroid bases 2024-07-20

In the minimum spanning tree (MST) interdiction problem, we are given a graph $G=(V,E)$ with edge weights, and want to find some $X\subseteq E$ satisfying a knapsack constraint such that the MST weight in $(V,E\setminus X)$ is maximized. Since MSTs of $G$ are the minimum weight bases in the graphic matroid of $G$, this problem is a special case of matroid interdiction on a matroid $M=(E,\mathcal{I})$, in which the objective is instead to maximize the minimum weight of a basis of $M$ which is disjoint from $X$. By reduction from 0-1 knapsack, matroid interdiction is NP-complete, even for uniform matroids. We develop a new exact algorithm to solve the matroid interdiction problem. One of the key components of our algorithm is a dynamic programming upper bound which only requires that a simpler discrete derivative problem can be calculated/approximated for the given matroid. Our exact algorithm then uses this bound within a custom branch-and-bound algorithm. For different matroids, we show how this discrete derivative can be calculated/approximated. In particular, for partition matroids, this yields a pseudopolynomial time algorithm. For graphic matroids, an approximation can be obtained by solving a sequence of minimum cut problems, which we apply to the MST interdiction problem. The running time of our algorithm is asymptotically faster than the best known MST interdiction algorithm, up to polylog factors. Furthermore, our algorithm achieves state-of-the-art computational performance: we solved all available instances from the literature, and in many cases reduced the best running time from hours to seconds.

29 pages, 2 figures
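The NP-completeness cited above goes via reduction from 0-1 knapsack; for reference, here is the textbook pseudopolynomial dynamic program for 0-1 knapsack (our naming; background only, not the paper's interdiction algorithm):

```python
# Classic 0-1 knapsack DP over capacities, iterating capacity downward so
# each item is used at most once. Runs in O(len(items) * capacity).

def knapsack_01(items, capacity):
    """items: list of (weight, value). Returns max total value within capacity."""
    best = [0] * (capacity + 1)
    for w, v in items:
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]
```

The same pseudopolynomial flavor reappears in the paper's partition-matroid result.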
Tight Lower Bounds for Directed Cut Sparsification and Distributed Min-Cut 2024-06-19

In this paper, we consider two fundamental cut approximation problems on large graphs. We prove new lower bounds for both problems that are optimal up to logarithmic factors. The first problem is to approximate cuts in balanced directed graphs. In this problem, the goal is to build a data structure that $(1 \pm ε)$-approximates cut values in graphs with $n$ vertices. For arbitrary directed graphs, such a data structure requires $Ω(n^2)$ bits even for constant $ε$. To circumvent this, recent works study $β$-balanced graphs, meaning that for every directed cut, the total weight of edges in one direction is at most $β$ times that in the other direction. We consider two models: the {\em for-each} model, where the goal is to approximate each cut with constant probability, and the {\em for-all} model, where all cuts must be preserved simultaneously. We improve the previous $Ω(n \sqrt{β/ε})$ lower bound to $\tilde{Ω}(n \sqrt{β}/ε)$ in the for-each model, and we improve the previous $Ω(n β/ε)$ lower bound to $Ω(n β/ε^2)$ in the for-all model. This resolves the main open questions of (Cen et al., ICALP, 2021). The second problem is to approximate the global minimum cut in a local query model, where we can only access the graph via degree, edge, and adjacency queries. We improve the previous $Ω\bigl(\frac{m}{k}\bigr)$ query complexity lower bound to $Ω\bigl(\min\bigl\{m, \frac{m}{ε^2 k}\bigr\}\bigr)$ for this problem, where $m$ is the number of edges, $k$ is the size of the minimum cut, and we seek a $(1+ε)$-approximation. In addition, we show that existing upper bounds with slight modifications match our lower bound up to logarithmic factors.

Fast Broadcast in Highly Connected Networks 2024-04-19
Show

We revisit the classic broadcast problem, wherein we have $k$ messages, each composed of $O(\log{n})$ bits, distributed arbitrarily across a network. The objective is to broadcast these messages to all nodes in the network. In the distributed CONGEST model, a textbook algorithm solves this problem in $O(D+k)$ rounds, where $D$ is the diameter of the graph. While the $O(D)$ term in the round complexity is unavoidable (given that $Ω(D)$ rounds are necessary to solve broadcast in any graph), it remains unclear whether the $O(k)$ term is needed in all graphs. In cases where the minimum cut size is one, simply transmitting messages from one side of the cut to the other would require $Ω(k)$ rounds. However, if the size of the minimum cut is larger, it may be possible to develop faster algorithms. This motivates the exploration of the broadcast problem in networks with high edge connectivity. In this work, we present a simple randomized distributed algorithm for performing $k$-message broadcast in $O(((n+k)/λ)\log n)$ rounds in any $n$-node simple graph with edge connectivity $λ$. When $k = Ω(n)$, our algorithm is universally optimal, up to an $O(\log n)$ factor, as its complexity nearly matches an information-theoretic $Ω(k/λ)$ lower bound that applies to all graphs, even when the network topology is known to the algorithm. The setting $k = Ω(n)$ is particularly interesting because several fundamental problems can be reduced to broadcasting $Ω(n)$ messages. Our broadcast algorithm finds several applications in distributed computing, enabling $O(1)$-approximation for all distances and $(1+ε)$-approximation for all cut sizes in $\tilde{O}(n/λ)$ rounds.

Engineering A Workload-balanced Push-Relabel Algorithm for Massive Graphs on GPUs 2024-03-30
Show

The push-relabel algorithm is an efficient algorithm for the maximum flow and minimum cut problems, owing to its affinity for parallelization. As graph sizes grow exponentially, researchers have used Graphics Processing Units (GPUs) to further accelerate the push-relabel computation. However, prior works must handle the significant memory consumption of representing a massive residual graph. In addition, the nature of their algorithms leads to inherently imbalanced workload distribution on GPUs. This paper first identifies these two challenges through memory and computational models. Based on the analysis of these models, we propose a workload-balanced push-relabel algorithm (WBPR) with two enhanced compressed sparse row (CSR) representations and a vertex-centric approach. The enhanced CSR significantly reduces memory consumption, while the vertex-centric approach alleviates the workload imbalance and improves GPU utilization. In our experiments, our approach reduces memory consumption from O(V^2) to O(V + E). Moreover, we achieve up to 7.31x and 2.29x runtime speedup over the state-of-the-art on real-world graphs in maximum flow and bipartite matching tasks, respectively. Our code will be open-sourced for further research on accelerating the push-relabel algorithm.
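
The core push-relabel loop is short enough to sketch sequentially. The following is a minimal, hedged Python version (dict-of-dicts residual capacities, FIFO-style discharge) with illustrative names; it is not the paper's GPU/CSR implementation.

```python
from collections import deque

def max_flow_push_relabel(n, edges, s, t):
    """Generic push-relabel max flow. edges: (u, v, capacity) triples.
    Returns the value of a maximum s-t flow."""
    cap = [dict() for _ in range(n)]  # residual capacities
    for u, v, c in edges:
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # ensure reverse residual edge exists
    height = [0] * n
    excess = [0] * n
    height[s] = n  # source starts at height n
    active = deque()
    for v, c in list(cap[s].items()):  # saturate every edge out of s
        if c > 0:
            cap[s][v] -= c
            cap[v][s] += c
            excess[v] += c
            if v != t:
                active.append(v)
    while active:
        u = active.popleft()
        while excess[u] > 0:
            pushed = False
            for v, c in cap[u].items():
                if c > 0 and height[u] == height[v] + 1:  # admissible edge
                    d = min(excess[u], c)
                    cap[u][v] -= d
                    cap[v][u] += d
                    excess[u] -= d
                    excess[v] += d
                    if v not in (s, t) and excess[v] == d:
                        active.append(v)  # v just became active
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:
                # relabel: lift u just above its lowest residual neighbor
                height[u] = 1 + min(h for v, c in cap[u].items()
                                    if c > 0 for h in (height[v],))
    return excess[t]
```

The imbalance the paper targets is visible even here: how long the discharge loop runs varies wildly across vertices, which is what a GPU mapping must smooth out.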

Multi-Agent Team Access Monitoring: Environments that Benefit from Target Information Sharing 2024-03-28
Show

Robotic access monitoring of multiple target areas has applications including checkpoint enforcement, surveillance and containment of fire and flood hazards. Monitoring access for a single target region has been successfully modeled as a minimum-cut problem. We generalize this model to support multiple target areas using two approaches: iterating on individual targets and examining the collections of targets holistically. Through simulation we measure the performance of each approach on different scenarios.
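
One standard way to treat several target areas holistically is the supersink reduction: connect every target to a new sink with infinite capacity and compute a single s-t minimum cut. The sketch below is a generic Edmonds-Karp illustration of that reduction under assumed names, not the paper's method.

```python
from collections import deque

def min_cut_edges(n, cap, s, targets):
    """Min cut separating s from a set of targets: add supersink node n with
    infinite-capacity edges from every target, run Edmonds-Karp, then read
    the cut off the residual graph. cap: list of (u, v, capacity)."""
    t = n
    res = [[0] * (n + 1) for _ in range(n + 1)]
    for u, v, c in cap:
        res[u][v] += c
    for v in targets:
        res[v][t] = float("inf")

    def bfs():  # shortest augmenting path in the residual graph
        prev = [-1] * (n + 1)
        prev[s] = s
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n + 1):
                if prev[v] == -1 and res[u][v] > 0:
                    prev[v] = u
                    if v == t:
                        return prev
                    q.append(v)
        return None

    while (prev := bfs()) is not None:
        v, d = t, float("inf")
        while v != s:  # bottleneck capacity along the path
            d = min(d, res[prev[v]][v]); v = prev[v]
        v = t
        while v != s:  # update residual capacities
            res[prev[v]][v] -= d; res[v][prev[v]] += d; v = prev[v]
    seen = {s}  # source side of the residual graph gives the cut
    q = deque([s])
    while q:
        u = q.popleft()
        for v in range(n + 1):
            if v not in seen and res[u][v] > 0:
                seen.add(v); q.append(v)
    return [(u, v, c) for u, v, c in cap if u in seen and v not in seen]
```

Iterating targets individually instead amounts to calling this once per target with a single sink, which is the other approach the abstract compares.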

A sublinear query quantum algorithm for s-t minimum cut on dense simple graphs 2024-02-05
Show

An $s{\operatorname{-}}t$ minimum cut in a graph corresponds to a minimum weight subset of edges whose removal disconnects vertices $s$ and $t$. Finding such a cut is a classic problem that is dual to that of finding a maximum flow from $s$ to $t$. In this work we describe a quantum algorithm for the minimum $s{\operatorname{-}}t$ cut problem on undirected graphs. For an undirected graph with $n$ vertices, $m$ edges, and integral edge weights bounded by $W$, the algorithm computes with high probability the weight of a minimum $s{\operatorname{-}}t$ cut after $\widetilde O(\sqrt{m} n^{5/6} W^{1/3})$ queries to the adjacency list of $G$. For simple graphs this bound is always $\widetilde O(n^{11/6})$, even in the dense case when $m = Ω(n^2)$. In contrast, a randomized algorithm must make $Ω(m)$ queries to the adjacency list of a simple graph $G$ even to decide whether $s$ and $t$ are connected.

The proof of the upper bound on the time complexity in the first arXiv version contained a fatal flaw. In this version we remove the claim about time complexity and prove the result only for query complexity

Cactus Representation of Minimum Cuts: Derandomize and Speed up 2024-01-19
Show

Given an undirected weighted graph with $n$ vertices and $m$ edges, we give the first deterministic $m^{1+o(1)}$-time algorithm for constructing the cactus representation of *all* global minimum cuts. This improves the current $n^{2+o(1)}$-time state-of-the-art deterministic algorithm, which can be obtained by combining ideas implicitly from three papers [Karger JACM'2000, Li STOC'2021, and Gabow TALG'2016]. The best known explicitly stated deterministic algorithm has a runtime of $\tilde{O}(mn)$ [Fleischer 1999, Nagamochi and Nakao 2000]. Using our technique, we can even speed up the fastest randomized algorithm of [Karger and Panigrahi, SODA'2009], whose running time is at least $Ω(m\log^4 n)$, to $O(m\log^3 n)$.

SODA 2024
Fully Dynamic Min-Cut of Superconstant Size in Subpolynomial Time 2024-01-18
Show

We present a deterministic fully dynamic algorithm with subpolynomial worst-case time per graph update such that after processing each update of the graph, the algorithm outputs a minimum cut of the graph if the graph has a cut of size at most $c$ for some $c = (\log n)^{o(1)}$. Previously, the best update time was $\widetilde O(\sqrt{n})$ for any $c > 2$ and $c = O(\log n)$ [Thorup, Combinatorica'07].

SODA 2024
Detachment Problem -- Application in Prevention of Information Leakage in Stock Markets 2024-01-13
Show

In this paper, we introduce the Detachment Problem. It can be seen as a generalized Vaccination Problem. The aim is to optimally cut the individuals' ties to circles that connect them to others, to minimize the overall information transfer in a social network. When an individual is isolated from a particular circle, it leads to the elimination of the connections to all the members of that circle, yet the connections to other circles remain. This approach contrasts with the conventional vaccination problem, in which a subset of vertices is totally eliminated. In our case, the connections of individuals to their circles are selectively, rather than entirely, eliminated. Contextually, this article focuses on private information flows, specifically within networks formed by memberships in circles of insiders in companies. Our quasi-empirical study uses simulated information flows on an observable network, and the statistical properties of the simulated information flows are matched with real-world data. In a broader context, this paper presents the Detachment Problem as a versatile approach for optimal social distancing, applicable across various scenarios. We propose and define the concept of expected proportional outside influence, or EPOI, as a measure of how widespread an information leak is. We also implement a greedy algorithm for finding a set of detachments to minimize EPOI. For comparison, we devise a simple heuristic based on minimum cut, to separate the most influential circles from each other. We provide evidence that the greedy algorithm is not optimal, and it is sometimes outperformed by the simple heuristic minimum cut algorithm. However, the greedy algorithm outperforms the cut algorithm in most cases. Further avenues of research are discussed.

Deterministic Near-Linear Time Minimum Cut in Weighted Graphs 2024-01-11
Show

In 1996, Karger [Kar96] gave a startling randomized algorithm that finds a minimum cut in a (weighted) graph in time $O(m\log^3n)$, which he termed near-linear time, meaning linear (in the size of the input) times a polylogarithmic factor. In this paper, we give the first deterministic algorithm which runs in near-linear time for weighted graphs. Previously, the breakthrough results of Kawarabayashi and Thorup [KT19] gave a near-linear time algorithm for simple graphs. The main technique here is a clustering procedure that perfectly preserves minimum cuts. Recently, Li [Li21] gave an $m^{1+o(1)}$ deterministic minimum-cut algorithm for weighted graphs; this form of running time has been termed "almost-linear". Li uses almost-linear time deterministic expander decompositions which do not perfectly preserve minimum cuts, but he can use these clusterings to, in a sense, "derandomize" the methods of Karger. In terms of techniques, we provide a structural theorem that says there exists a sparse clustering that preserves minimum cuts in a weighted graph with $o(1)$ error. In addition, we construct it deterministically in near-linear time. This was done exactly for simple graphs in [KT19, HRW20] and with polylogarithmic error for weighted graphs in [Li21]. Extending the techniques in [KT19, HRW20] to weighted graphs presents significant challenges, and moreover, the algorithm can only polylogarithmically approximately preserve minimum cuts. A remaining challenge is to reduce the polylogarithmic-approximate clusterings to $1+o(1/\log n)$-approximate so that they can be applied recursively as in [Li21] over $O(\log n)$ many levels. This is an additional challenge that requires building on properties of tree-packings in the presence of a wide range of edge weights to, for example, find sources for local flow computations which identify minimum cuts that cross clusters.

SODA 2024, 60 pages
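
A classic randomized baseline for global minimum cut is Karger's contraction algorithm: repeatedly contract random edges until two supernodes remain, and repeat many trials. The sketch below is illustrative only (fixed trial count, unweighted multigraphs, assumed function names), not the near-linear methods discussed in the abstract.

```python
import random

def karger_min_cut(edges, trials=500, seed=1):
    """Karger's contraction: each trial contracts random edges until two
    supernodes remain; the edges crossing them form a cut. The minimum over
    many trials is a global minimum cut with high probability."""
    rng = random.Random(seed)
    verts = sorted({u for e in edges for u in e})
    best = float("inf")
    for _ in range(trials):
        parent = {v: v for v in verts}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path compression
                x = parent[x]
            return x

        remaining = len(verts)
        pool = list(edges)
        rng.shuffle(pool)
        for u, v in pool:
            if remaining == 2:
                break
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv  # contract edge (u, v)
                remaining -= 1
        best = min(best, sum(1 for u, v in edges if find(u) != find(v)))
    return best
```

A single trial succeeds with probability at least $2/(n(n-1))$, which is why $O(n^2 \log n)$ trials give high-probability correctness.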
Robust Zero Level-Set Extraction from Unsigned Distance Fields Based on Double Covering 2024-01-10
Show

In this paper, we propose a new method, called DoubleCoverUDF, for extracting the zero level-set from unsigned distance fields (UDFs). DoubleCoverUDF takes a learned UDF and a user-specified parameter $r$ (a small positive real number) as input and extracts an iso-surface with an iso-value $r$ using the conventional marching cubes algorithm. We show that the computed iso-surface is the boundary of the $r$-offset volume of the target zero level-set $S$, which is an orientable manifold, regardless of the topology of $S$. Next, the algorithm computes a covering map to project the boundary mesh onto $S$, preserving the mesh's topology and avoiding folding. If $S$ is an orientable manifold surface, our algorithm separates the double-layered mesh into a single layer using a robust minimum-cut post-processing step. Otherwise, it keeps the double-layered mesh as the output. We validate our algorithm by reconstructing 3D surfaces of open models and demonstrate its efficacy and effectiveness on synthetic models and benchmark datasets. Our experimental results confirm that our method is robust and produces meshes with better quality in terms of both visual evaluation and quantitative measures than existing UDF-based methods. The source code is available at https://github.com/jjjkkyz/DCUDF.

published in ACM Transactions on Graphics (SIGGRAPH Asia 2023)

Minimum 0-Extension Problems on Directed Metrics 2024-01-04
Show

For a metric $μ$ on a finite set $T$, the minimum 0-extension problem 0-Ext$[μ]$ is defined as follows: Given $V\supseteq T$ and $c:\binom{V}{2}\rightarrow \mathbf{Q}_+$, minimize $\sum c(xy)μ(γ(x),γ(y))$ subject to $γ:V\rightarrow T,\ γ(t)=t\ (\forall t\in T)$, where the sum is taken over all unordered pairs in $V$. This problem generalizes several classical combinatorial optimization problems such as the minimum cut problem or the multiterminal cut problem. Karzanov and Hirai established a complete classification of metrics $μ$ for which 0-Ext$[μ]$ is polynomial time solvable or NP-hard. This result can also be viewed as a sharpening of the general dichotomy theorem for finite-valued CSPs (Thapper and Živný 2016) specialized to 0-Ext$[μ]$. In this paper, we consider a directed version $\overrightarrow{0}$-Ext$[μ]$ of the minimum 0-extension problem, where $μ$ and $c$ are not assumed to be symmetric. We extend the NP-hardness condition of 0-Ext$[μ]$ to $\overrightarrow{0}$-Ext$[μ]$: If $μ$ cannot be represented as the shortest path metric of an orientable modular graph with an orbit-invariant "directed" edge-length, then $\overrightarrow{0}$-Ext$[μ]$ is NP-hard. We also show a partial converse: If $μ$ is a directed metric of a modular lattice with an orbit-invariant directed edge-length, then $\overrightarrow{0}$-Ext$[μ]$ is tractable. We further provide a new NP-hardness condition characteristic of $\overrightarrow{0}$-Ext$[μ]$, and establish a dichotomy for the case where $μ$ is a directed metric of a star.

Cactus Representations in Polylogarithmic Max-flow via Maximal Isolating Mincuts 2023-11-17
Show

A cactus representation of a graph, introduced by Dinitz et al. in 1976, is an edge sparsifier of $O(n)$ size that exactly captures all global minimum cuts of the graph. It is a central combinatorial object that has been a key ingredient in almost all algorithms for the connectivity augmentation problems and for maintaining minimum cuts under edge insertions (e.g. [NGM97], [CKL+22], [Hen97]). This sparsifier was generalized to Steiner cactus for a vertex set $T$, which can be seen as a vertex sparsifier of $O(|T|)$ ...
Local algorithms for the maximum flow and minimum cut in bounded-degree networks 2023-11-02
Show

We show a deterministic constant-time local algorithm for constructing an approximately maximum flow and minimum fractional cut in multisource-multitarget networks with bounded degrees and bounded edge capacities. Locality means that the decision we make about each edge only depends on its constant radius neighborhood. We show two applications of the algorithms: one is related to the Aldous-Lyons Conjecture, and the other is about approximating the neighborhood distribution of graphs by bounded-size graphs. The scope of our results can be extended to unimodular random graphs and networks. As a corollary, we generalize the Maximum Flow Minimum Cut Theorem to unimodular random flow networks.

Improved Approximations for Relative Survivable Network Design 2023-10-03
Show

One of the most important and well-studied settings for network design is edge-connectivity requirements. This encompasses uniform demands such as the Minimum $k$-Edge-Connected Spanning Subgraph problem as well as nonuniform demands such as the Survivable Network Design problem (SND). In a recent paper, [Dinitz, Koranteng, Kortsarz APPROX '22] observed that a weakness of these formulations is that they do not enable one to consider fault-tolerance in graphs that have just a few small cuts. To remedy this, they introduced new variants of these problems under the notion of "relative" fault-tolerance. Informally, this requires not that two nodes be connected if there are a bounded number of faults (as in the classical setting), but that two nodes be connected if there are a bounded number of faults and the two nodes are connected in the underlying graph post-faults. The problem is already highly non-trivial even for the case of a single demand. Due to difficulties introduced by this new notion of fault-tolerance, the results in [Dinitz, Koranteng, Kortsarz APPROX '22] are quite limited. For the Relative Survivable Network Design problem (RSND), when the demands are not uniform, they give a nontrivial result only when there is a single demand with a connectivity requirement of $3$: a non-optimal $27/4$-approximation. We strengthen this result in two significant ways: we give a $2$-approximation for RSND where all requirements are at most $3$, and a $2^{O(k^2)}$-approximation for RSND with a single demand of arbitrary value $k$. To achieve these results, we first use the "cactus representation" of minimum cuts to give a lossless reduction to normal SND. Second, we extend the techniques of [Dinitz, Koranteng, Kortsarz APPROX '22] to prove a generalized and more complex version of their structure theorem, which we then use to design a recursive approximation algorithm.

34 pages, 4 figures
Fault Trees, Decision Trees, And Binary Decision Diagrams: A Systematic Comparison 2023-10-03
Show

In reliability engineering, we need to understand system dependencies, cause-effect relations, identify critical components, and analyze how they trigger failures. Three prominent graph models commonly used for these purposes are fault trees (FTs), decision trees (DTs), and binary decision diagrams (BDDs). These models are popular because they are easy to interpret, serve as a communication tool between stakeholders of various backgrounds, and support decision-making processes. Moreover, these models help to understand real-world problems by computing reliability metrics, minimum cut sets, logic rules, and displaying dependencies. Nevertheless, it is unclear how these graph models compare. Thus, the goal of this paper is to understand the similarities and differences through a systematic comparison based on their (i) purpose and application, (ii) structural representation, (iii) analysis methods, (iv) construction, and (v) benefits & limitations. Furthermore, we use a running example based on a Container Seal Design to showcase the models in practice. Our results show that, given that FTs, DTs and BDDs have different purposes and application domains, they adopt different structural representations and analysis methodologies that entail a variety of benefits and limitations; the latter can be addressed via conversion methods or extensions. Specific remarks are that BDDs can be considered as a compact representation of binary DTs, since the former allows sub-node sharing, which makes BDDs more efficient at representing logical rules than binary DTs. It is possible to obtain cut sets from BDDs and DTs and construct a FT using the (con/dis)junctive normal form, although this may result in a sub-optimal FT structure.

Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model 2023-07-23
Show

In this paper, we show that representations capturing syllabic units emerge when training a self-supervised speech model with a visually-grounded training objective. We demonstrate that a nearly identical model architecture (HuBERT) trained with a masked language modeling loss does not exhibit this same ability, suggesting that the visual grounding objective is responsible for the emergence of this phenomenon. We propose the use of a minimum cut algorithm to automatically predict syllable boundaries in speech, followed by a 2-stage clustering method to group identical syllables together. We show that our model not only outperforms a state-of-the-art syllabic segmentation method on the language it was trained on (English), but also generalizes in a zero-shot fashion to Estonian. Finally, we show that the same model is capable of zero-shot generalization for a word segmentation task on 4 other languages from the Zerospeech Challenge, in some cases beating the previous state-of-the-art.

Interspeech 2023. Code & Model: https://github.com/jasonppy/syllable-discovery

Beyond the Quadratic Time Barrier for Network Unreliability 2023-07-20
Show

Karger (STOC 1995) gave the first FPTAS for the network (un)reliability problem, setting in motion research over the next three decades that obtained increasingly faster running times, eventually leading to a $\tilde{O}(n^2)$-time algorithm (Karger, STOC 2020). This represented a natural culmination of this line of work because the algorithmic techniques used can enumerate $Θ(n^2)$ (near)-minimum cuts. In this paper, we go beyond this quadratic barrier and obtain a faster FPTAS for the network unreliability problem. Our algorithm runs in $m^{1+o(1)} + \tilde{O}(n^{1.5})$ time. Our main contribution is a new estimator for network unreliability in very reliable graphs. These graphs are usually the bottleneck for network unreliability since the disconnection event is elusive. Our estimator is obtained by defining an appropriate importance sampling subroutine on a dual spanning tree packing of the graph. To complement this estimator for very reliable graphs, we use recursive contraction for moderately reliable graphs. We show that an interleaving of sparsification and contraction can be used to obtain a better parametrization of the recursive contraction algorithm that yields a faster running time matching the one obtained for the very reliable case.

Minimum Cuts in Geometric Intersection Graphs 2023-05-26
Show

Let $\mathcal{D}$ be a set of $n$ disks in the plane. The disk graph $G_\mathcal{D}$ for $\mathcal{D}$ is the undirected graph with vertex set $\mathcal{D}$ in which two disks are joined by an edge if and only if they intersect. The directed transmission graph $G^{\rightarrow}_\mathcal{D}$ for $\mathcal{D}$ is the directed graph with vertex set $\mathcal{D}$ in which there is an edge from a disk $D_1 \in \mathcal{D}$ to a disk $D_2 \in \mathcal{D}$ if and only if $D_1$ contains the center of $D_2$. Given $\mathcal{D}$ and two non-intersecting disks $s, t \in \mathcal{D}$, we show that a minimum $s$-$t$ vertex cut in $G_\mathcal{D}$ or in $G^{\rightarrow}_\mathcal{D}$ can be found in $O(n^{3/2}\text{polylog} n)$ expected time. To obtain our result, we combine an algorithm for the maximum flow problem in general graphs with dynamic geometric data structures to manipulate the disks. As an application, we consider the barrier resilience problem in a rectangular domain. In this problem, we have a vertical strip $S$ bounded by two vertical lines, $L_\ell$ and $L_r$, and a collection $\mathcal{D}$ of disks. Let $a$ be a point in $S$ above all disks of $\mathcal{D}$, and let $b$ be a point in $S$ below all disks of $\mathcal{D}$. The task is to find a curve from $a$ to $b$ that lies in $S$ and that intersects as few disks of $\mathcal{D}$ as possible. Using our improved algorithm for minimum cuts in disk graphs, we can solve the barrier resilience problem in $O(n^{3/2}\text{polylog} n)$ expected time.

11 pages, 4 figures; this version corrects a small bug in the proof of Lemma 5. We thank Matej Marinko for pointing this out
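
The two graph definitions above translate directly to code: disks intersect when the distance between centers is at most the sum of radii, and a transmission edge exists when one disk covers the other's center. A naive $O(n^2)$ construction sketch with illustrative names (the paper's contribution is avoiding exactly this kind of explicit construction):

```python
import math

def disk_graph_edges(disks):
    """Undirected disk graph: disks are (x, y, r); an edge (i, j) exists
    iff the center distance is at most r_i + r_j."""
    edges = []
    for i in range(len(disks)):
        for j in range(i + 1, len(disks)):
            (x1, y1, r1), (x2, y2, r2) = disks[i], disks[j]
            if math.hypot(x1 - x2, y1 - y2) <= r1 + r2:
                edges.append((i, j))
    return edges

def transmission_edges(disks):
    """Directed transmission graph: edge i -> j iff disk i contains the
    center of disk j."""
    return [(i, j)
            for i, (x1, y1, r1) in enumerate(disks)
            for j, (x2, y2, _) in enumerate(disks)
            if i != j and math.hypot(x1 - x2, y1 - y2) <= r1]
```

Note the asymmetry: a small disk near a large one can receive a transmission edge without sending one back, which is why $G^{\rightarrow}_\mathcal{D}$ is directed.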

An Efficient Algorithm for All-Pairs Bounded Edge Connectivity 2023-05-03
Show

Our work concerns algorithms for an unweighted variant of Maximum Flow. In the All-Pairs Connectivity (APC) problem, we are given a graph $G$ on $n$ vertices and $m$ edges, and are tasked with computing the maximum number of edge-disjoint paths from $s$ to $t$ (equivalently, the size of a minimum $(s,t)$-cut) in $G$, for all pairs of vertices $(s,t)$. Although over undirected graphs APC can be solved in essentially optimal $n^{2+o(1)}$ time, the true time complexity of APC over directed graphs remains open: this problem can be solved in $\tilde{O}(m^ω)$ time, where $ω\in [2, 2.373)$ is the exponent of matrix multiplication, but no matching conditional lower bound is known. We study a variant of APC called the $k$-Bounded All Pairs Connectivity ($k$-APC) problem. In this problem, we are given an integer $k$ and graph $G$, and are tasked with reporting the size of a minimum $(s,t)$-cut only for pairs $(s,t)$ of vertices with a minimum cut size less than $k$ (if the minimum $(s,t)$-cut has size at least $k$, we just report it is "large" instead of computing the exact value). We present an algorithm solving $k$-APC in directed graphs in $\tilde{O}((kn)^ω)$ time. This runtime is $\tilde O(n^ω)$ for all $k$ polylogarithmic in $n$, which is essentially optimal under popular conjectures from fine-grained complexity. Previously, this runtime was only known for $k\le 2$ [Georgiadis et al., ICALP 2017]. We also study a variant of $k$-APC, the $k$-Bounded All-Pairs Vertex Connectivity ($k$-APVC) problem, which considers internally vertex-disjoint paths instead of edge-disjoint paths. We present an algorithm solving $k$-APVC in directed graphs in $\tilde{O}(k^2n^ω)$ time. Previous work solved an easier version of the $k$-APVC problem in $\tilde O((kn)^ω)$ time [Abboud et al, ICALP 2019].

Total Variation Graph Neural Networks 2023-04-27
Show

Recently proposed Graph Neural Networks (GNNs) for vertex clustering are trained with an unsupervised minimum cut objective, approximated by a Spectral Clustering (SC) relaxation. However, the SC relaxation is loose and, while it offers a closed-form solution, it also yields overly smooth cluster assignments that poorly separate the vertices. In this paper, we propose a GNN model that computes cluster assignments by optimizing a tighter relaxation of the minimum cut based on graph total variation (GTV). The cluster assignments can be used directly to perform vertex clustering or to implement graph pooling in a graph classification framework. Our model consists of two core components: i) a message-passing layer that minimizes the $\ell_1$ distance in the features of adjacent vertices, which is key to achieving sharp transitions between clusters; ii) an unsupervised loss function that minimizes the GTV of the cluster assignments while ensuring balanced partitions. Experimental results show that our model outperforms other GNNs for vertex clustering and graph classification.

The Energy Complexity of Diameter and Minimum Cut Computation in Bounded-Genus Networks 2023-04-10
Show

This paper investigates the energy complexity of distributed graph problems in multi-hop radio networks, where the energy cost of an algorithm is measured by the maximum number of awake rounds of a vertex. Recent works revealed that some problems, such as broadcast, breadth-first search, and maximal matching, can be solved with energy-efficient algorithms that consume only $\text{poly} \log n$ energy. However, there exist some problems, such as computing the diameter of the graph, that require $Ω(n)$ energy to solve. To improve energy efficiency for these problems, we focus on a special graph class: bounded-genus graphs. We present algorithms for computing the exact diameter, the exact global minimum cut size, and a $(1 \pm ε)$-approximate $s$-$t$ minimum cut size with $\tilde{O}(\sqrt{n})$ energy for bounded-genus graphs. Our approach is based on a generic framework that divides the vertex set into high-degree and low-degree parts and leverages the structural properties of bounded-genus graphs to control the number of certain connected components in the subgraph induced by the low-degree part.

Removing results that were already moved to arXiv:2007.09816. Polishing the writing. Changing the title. To appear in SIROCCO 2023

Massively Parallel Computation in a Heterogeneous Regime 2023-02-28
Show

Massively-parallel graph algorithms have received extensive attention over the past decade, with research focusing on three memory regimes: the superlinear regime, the near-linear regime, and the sublinear regime. The sublinear regime is the most desirable in practice, but conditional hardness results point towards its limitations. In this work we study a *heterogeneous* model, where the memory of the machines varies in size. We focus mostly on the heterogeneous setting created by adding a single near-linear machine to the sublinear MPC regime, and show that even a single large machine suffices to circumvent most of the conditional hardness results for the sublinear regime: for graphs with $n$ vertices and $m$ edges, we give (a) an MST algorithm that runs in $O(\log\log(m/n))$ rounds; (b) an algorithm that constructs an $O(k)$-spanner of size $O(n^{1+1/k})$ in $O(1)$ rounds; and (c) a maximal-matching algorithm that runs in $O(\sqrt{\log(m/n)}\log\log(m/n))$ rounds. We also observe that the best known near-linear MPC algorithms for several other graph problems which are conjectured to be hard in the sublinear regime (minimum cut, maximal independent set, and vertex coloring) can easily be transformed to work in the heterogeneous MPC model with a single near-linear machine, while retaining their original round complexity in the near-linear regime. If the large machine is allowed to have *superlinear* memory, all of the problems above can be solved in $O(1)$ rounds.

Appeared in PODC2022
TetCNN: Convolutional Neural Networks on Tetrahedral Meshes 2023-02-14
Show

Convolutional neural networks (CNN) have been broadly studied on images, videos, graphs, and triangular meshes. However, it has seldom been studied on tetrahedral meshes. Given the merits of using volumetric meshes in applications like brain image analysis, we introduce a novel interpretable graph CNN framework for the tetrahedral mesh structure. Inspired by ChebyNet, our model exploits the volumetric Laplace-Beltrami Operator (LBO) to define filters over commonly used graph Laplacian which lacks the Riemannian metric information of 3D manifolds. For pooling adaptation, we introduce new objective functions for localized minimum cuts in the Graclus algorithm based on the LBO. We employ a piece-wise constant approximation scheme that uses the clustering assignment matrix to estimate the LBO on sampled meshes after each pooling. Finally, adapting the Gradient-weighted Class Activation Mapping algorithm for tetrahedral meshes, we use the obtained heatmaps to visualize discovered regions-of-interest as biomarkers. We demonstrate the effectiveness of our model on cortical tetrahedral meshes from patients with Alzheimer's disease, as there is scientific evidence showing the correlation of cortical thickness to neurodegenerative disease progression. Our results show the superiority of our LBO-based convolution layer and adapted pooling over the conventionally used unitary cortical thickness, graph Laplacian, and point cloud representation.

Accepted as a conference paper to Information Processing in Medical Imaging (IPMI 2023) conference

Approximate minimum cuts and their enumeration 2022-11-30
Show

We show that every $α$-approximate minimum cut in a connected graph is the unique minimum $(S,T)$-terminal cut for some subsets $S$ and $T$ of vertices, each of size at most $\lfloor2α\rfloor+1$. This leads to an alternative proof that the number of $α$-approximate minimum cuts in an $n$-vertex connected graph is $n^{O(α)}$ and that they can all be enumerated in deterministic polynomial time for constant $α$.

Accepted to SOSA'23
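
For very small graphs, the $n^{O(α)}$ counting statement can be sanity-checked by enumerating all bipartitions directly. The brute-force sketch below is illustrative only (exponential time, assumed names), nothing like the paper's terminal-cut characterization.

```python
from itertools import combinations

def approx_min_cuts(n, edges, alpha):
    """Enumerate every bipartition (S, V-S) of an n-vertex graph and return
    (min cut value, list of cuts with value at most alpha * minimum)."""
    cuts = []
    for k in range(1, n // 2 + 1):
        for S in combinations(range(n), k):
            Sset = set(S)
            if 2 * len(Sset) == n and 0 not in Sset:
                continue  # balanced bipartitions would otherwise appear twice
            w = sum(1 for u, v in edges if (u in Sset) != (v in Sset))
            cuts.append((w, Sset))
    mincut = min(w for w, _ in cuts)
    return mincut, [(w, S) for w, S in cuts if w <= alpha * mincut]
```

On the 4-cycle, the minimum cut value is 2 and there are exactly $\binom{4}{2} = 6$ minimum cuts (one per pair of removed edges), matching the $n^{O(α)}$ bound at $α = 1$.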
A Nearly Time-Optimal Distributed Approximation of Minimum Cost $k$-Edge-Connected Spanning Subgraph 2022-11-09
Show

The minimum-cost $k$-edge-connected spanning subgraph ($k$-ECSS) problem is a generalization and strengthening of the well-studied minimum-cost spanning tree (MST) problem. While the round complexity of distributedly computing the latter has been well-understood, the former remains mostly open, especially as soon as $k\geq 3$. In this paper, we present the first distributed algorithm that computes an approximation of $k$-ECSS in sublinear time for general $k$. Concretely, we describe a randomized distributed algorithm that, in $\tilde{O}(k(D+k\sqrt{n}))$ rounds, computes a $k$-edge-connected spanning subgraph whose cost is within an $O(\log n\log k)$ factor of optimal. Here, $n$ and $D$ denote the number of vertices and diameter of the graph, respectively. This time complexity is nearly optimal for any $k=\mathrm{poly}(\log n)$, almost matching an $\tilde{Ω}(D+\sqrt{n/k})$ lower bound. Our algorithm is the first to achieve a sublinear round complexity for $k\geq 3$. We note that this case is considerably more challenging than the well-studied and well-understood $k=1$ case -- better known as MST -- and the closely related $k=2$ case. Our algorithm is based on reducing the $k$-ECSS problem to $k$ set cover instances, in which we gradually augment the connectivity of the spanning subgraph. To solve each set cover instance, we combine new structural observations on minimum cuts with graph sketching ideas. One key ingredient in our algorithm is a novel structural lemma that allows us to compress the information about all minimum cuts in a graph into a succinct representation, which is computed in a decentralized fashion. We hope that this succinct representation may find applications in other computational settings or for other problems.

SODA 2023
A Simple Differentially Private Algorithm for Global Minimum Cut 2022-08-22
Show

In this note, we present a simple differentially private algorithm for the global minimum cut problem using only one call to the exponential mechanism. This problem was first studied by Gupta et al. [2010], and they gave a differentially private algorithm with near-optimal utility guarantees. We improve upon their work in many aspects: our algorithm is simpler, more natural, and more efficient than the one given in Gupta et al. [2010], and furthermore provides slightly better privacy and utility guarantees.

There is an error in the privacy argument. The algorithm only outputs t such that the minimum s-t cut (S_t,V-S_t) gives an O(log n/eps) approximation. There is currently no way to privately compute min s-t cut, so this doesn't do anything
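
The mechanism the abstract describes is easy to sketch on a toy instance. The following hedged example applies one call to the exponential mechanism over all cuts of a tiny graph; the brute-force enumeration and the very large `eps` are demo choices for a deterministic outcome, not the paper's parameters:

```python
import math
import random
from itertools import combinations

def cut_value(edges, side):
    return sum(1 for u, v in edges if (u in side) != (v in side))

def dp_min_cut(n, edges, eps, rng):
    """One call to the exponential mechanism over all cuts of a small
    graph: output cut S with probability proportional to
    exp(-eps * cut(S) / 2). Adding or removing one edge changes any
    cut value by at most 1, which bounds the sensitivity. Brute force
    over 2^(n-1) cuts, so a sketch for tiny graphs only."""
    sides = [frozenset((0,) + rest)
             for r in range(1, n)
             for rest in combinations(range(1, n), r - 1)]
    weights = [math.exp(-eps * cut_value(edges, s) / 2) for s in sides]
    return rng.choices(sides, weights=weights, k=1)[0]

# Two triangles joined by one bridge edge: the unique min cut has value 1.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
# eps chosen large so the demo almost surely returns the true min cut.
picked = dp_min_cut(6, edges, eps=40.0, rng=random.Random(0))
```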

Vertex Sparsifiers for Hyperedge Connectivity 2022-07-12
Show

Recently, Chalermsook et al. [SODA'21 (arXiv:2007.07862)] introduced a notion of vertex sparsifiers for $c$-edge connectivity, which has found applications in parameterized algorithms for network design and has also led to exciting dynamic algorithms for $c$-edge st-connectivity [Jin and Sun FOCS'21 (arXiv:2004.07650)]. We study a natural extension called vertex sparsifiers for $c$-hyperedge connectivity and construct a sparsifier whose size matches the state of the art for normal graphs. More specifically, we show that, given a hypergraph $G=(V,E)$ with $n$ vertices and $m$ hyperedges, $k$ terminal vertices, and a parameter $c$, there exists a hypergraph $H$ containing only $O(kc^{3})$ hyperedges that preserves all minimum cuts (up to value $c$) between all subsets of terminals. This matches the best bound of $O(kc^{3})$ edges for normal graphs by [Liu'20 (arXiv:2011.15101)]. Moreover, $H$ can be constructed in almost-linear $O(p^{1+o(1)} + n(rc\log n)^{O(rc)}\log m)$ time, where $r=\max_{e\in E}|e|$ is the rank and $p=\sum_{e\in E}|e|$ is the total size of the hypergraph.
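
For intuition, a hyperedge is cut when it has vertices on both sides of a partition. The following brute-force sketch computes the terminal min-cut values a sparsifier must preserve, on a hypothetical toy hypergraph (illustration only, not the paper's construction):

```python
from itertools import combinations

def hyper_cut(hyperedges, side):
    """A hyperedge is cut iff it has vertices on both sides of `side`."""
    side = set(side)
    return sum(1 for e in hyperedges
               if any(v in side for v in e) and any(v not in side for v in e))

def min_terminal_cut(vertices, hyperedges, s, t):
    """Brute-force minimum (s,t)-cut value in a tiny hypergraph: try
    every side containing s and excluding t. These are the values a
    sparsifier must preserve (up to c) between terminal subsets."""
    others = [v for v in vertices if v not in (s, t)]
    best = float("inf")
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            best = min(best, hyper_cut(hyperedges, {s, *extra}))
    return best

# Hypothetical toy hypergraph on 4 vertices.
verts = [0, 1, 2, 3]
hedges = [(0, 1, 2), (2, 3), (0, 3)]
val = min_terminal_cut(verts, hedges, 0, 3)
```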
Universally-Optimal Distributed Exact Min-Cut 2022-05-31
Show

We present a universally-optimal distributed algorithm for the exact weighted min-cut. The algorithm is guaranteed to complete in $\widetilde{O}(D + \sqrt{n})$ rounds on every graph, recovering the recent result of Dory, Efron, Mukhopadhyay, and Nanongkai [STOC'21], but runs much faster on structured graphs. Specifically, the algorithm completes in $\widetilde{O}(D)$ rounds on (weighted) planar graphs or, more generally, any (weighted) excluded-minor family. We obtain this result by designing an aggregation-based algorithm: each node receives only an aggregate of the messages sent to it. While somewhat restrictive, recent work shows any such black-box algorithm can be simulated on any minor of the communication network. Furthermore, we observe this also allows for the addition of (a small number of) arbitrarily-connected virtual nodes to the network. We leverage these capabilities to design a min-cut algorithm that is significantly simpler compared to prior distributed work. We hope this paper showcases how working within this paradigm yields simple-to-design and ultra-efficient distributed algorithms for global problems. Our main technical contribution is a distributed algorithm that, given any tree $T$, computes the minimum cut that $2$-respects $T$ (i.e., cuts at most $2$ edges of $T$) in universally near-optimal time. Moreover, our algorithm gives a \emph{deterministic} $\widetilde{O}(D)$-round 2-respecting cut solution for excluded-minor families and a \emph{deterministic} $\widetilde{O}(D + \sqrt{n})$-round solution for general graphs, the latter resolving a question of Dory et al. [STOC'21].

34 pages, accepted to PODC 2022

QB-II for Evaluating the Reliability of Binary-State Networks 2022-05-30
Show

Utility (gas, water, electric, 4G/5G) networks, the Internet of Things, social networks, and supply chains are among the many networks in real-life use today. Reliability is one of the most popular tools for evaluating network performance. The fundamental structure of these networks is a binary-state network. Various methods have been proposed to efficiently assess binary-state network reliability. A new algorithm called QB-II (quick binary-addition tree algorithm II) is proposed to improve the efficiency of the quick BAT, which is based on the BAT and outperforms many algorithms. The proposed QB-II implements the shortest minimum cuts (MCs) to separate the entire BAT into a main-BAT and sub-BATs, and uses source-target matrix convolution products to connect these subgraphs intelligently to improve efficiency. Twenty benchmark problems were used to validate the performance of QB-II.
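
As a baseline for what BAT-style algorithms speed up, exact binary-state s-t reliability can be computed on tiny networks by enumerating all 2^m edge states. A sketch (not QB-II itself):

```python
from itertools import product

def connected(n, up_edges, s, t):
    """Depth-first search over working edges: is t reachable from s?"""
    adj = {v: [] for v in range(n)}
    for u, v in up_edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return t in seen

def reliability(n, edges, probs, s, t):
    """Exact s-t reliability of a binary-state network by enumerating
    all 2^m edge states -- the exponential baseline that BAT-family
    algorithms such as QB-II aim to beat on larger instances."""
    rel = 0.0
    for states in product((0, 1), repeat=len(edges)):
        p = 1.0
        for st, pr in zip(states, probs):
            p *= pr if st else (1.0 - pr)
        up = [e for st, e in zip(states, edges) if st]
        if connected(n, up, s, t):
            rel += p
    return rel

# Two parallel s-t edges, each working with probability 0.9:
# reliability = 1 - 0.1 * 0.1 = 0.99.
r = reliability(2, [(0, 1), (0, 1)], [0.9, 0.9], 0, 1)
```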

Deterministic Min-cut in Poly-logarithmic Max-flows 2022-05-28
Show

We give a deterministic algorithm for finding the minimum (weight) cut of an undirected graph on $n$ vertices and $m$ edges using $\text{polylog}(n)$ calls to any maximum flow subroutine. Using the current best deterministic maximum flow algorithms, this yields an overall running time of $\tilde O(m \cdot \min(\sqrt{m}, n^{2/3}))$ for weighted graphs, and $m^{4/3+o(1)}$ for unweighted (multi)-graphs. This marks the first improvement for this problem since a running time bound of $\tilde O(mn)$ was established by several papers in the early 1990s. Our global minimum cut algorithm is obtained as a corollary of a minimum Steiner cut algorithm, where a minimum Steiner cut is a minimum (weight) set of edges whose removal disconnects at least one pair of vertices among a designated set of terminal vertices. The running time of our deterministic minimum Steiner cut algorithm matches that of the global minimum cut algorithm stated above. Using randomization, the running time improves to $m^{1+o(1)}$ because of a faster maximum flow subroutine; this improves the best known randomized algorithm for the minimum Steiner cut problem as well. Our main technical contribution is a new tool that we call isolating cuts. Given a set of vertices $R$, this entails finding cuts of minimum weight that separate (or isolate) each individual vertex $v\in R$ from the rest of the vertices $R\setminus \{v\}$. Naïvely, this can be done using $|R|$ maximum flow computations, but the isolating cuts procedure requires only $O(\log |R|)$ of them.
Review of Serial and Parallel Min-Cut/Max-Flow Algorithms for Computer Vision 2022-04-20
Show

Minimum cut/maximum flow (min-cut/max-flow) algorithms solve a variety of problems in computer vision and thus significant effort has been put into developing fast min-cut/max-flow algorithms. As a result, it is difficult to choose an ideal algorithm for a given problem. Furthermore, parallel algorithms have not been thoroughly compared. In this paper, we evaluate the state-of-the-art serial and parallel min-cut/max-flow algorithms on the largest set of computer vision problems yet. We focus on generic algorithms, i.e., for unstructured graphs, but also compare with the specialized GridCut implementation. When applicable, GridCut performs best. Otherwise, the two pseudoflow algorithms, Hochbaum pseudoflow and excesses incremental breadth first search, achieves the overall best performance. The most memory efficient implementation tested is the Boykov-Kolmogorov algorithm. Amongst generic parallel algorithms, we find the bottom-up merging approach by Liu and Sun to be best, but no method is dominant. Of the generic parallel methods, only the parallel preflow push-relabel algorithm is able to efficiently scale with many processors across problem sizes, and no generic parallel method consistently outperforms serial algorithms. Finally, we provide and evaluate strategies for algorithm selection to obtain good expected performance. We make our dataset and implementations publicly available for further research.

20 pages, 13 figures, accepted for publication at T-PAMI
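
A minimal serial baseline of the kind the survey benchmarks is Edmonds-Karp on a dense capacity matrix; by max-flow/min-cut duality the value returned equals the minimum s-t cut. A sketch only, far from the tuned implementations compared in the paper:

```python
from collections import deque

def max_flow(n, cap, s, t):
    """Edmonds-Karp: repeatedly find a shortest augmenting path by BFS
    on the residual capacity matrix `cap` (mutated in place) and push
    the bottleneck capacity along it. Returns the max-flow value, which
    equals the minimum s-t cut by max-flow/min-cut duality."""
    flow = 0
    while True:
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:       # no augmenting path left
            return flow
        # Find the bottleneck along the path, then push it.
        bottleneck, v = float("inf"), t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck

# Hypothetical 4-vertex network; the min cut into the sink has value 4.
cap = [[0, 3, 2, 0],
       [0, 0, 1, 2],
       [0, 0, 0, 2],
       [0, 0, 0, 0]]
f = max_flow(4, cap, 0, 3)
```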

A Framework to Design Approximation Algorithms for Finding Diverse Solutions in Combinatorial Problems 2022-01-22
Show

Finding a \emph{single} best solution is the most common objective in combinatorial optimization problems. However, such a single solution may not be applicable to real-world problems, as objective functions and constraints are only "approximately" formulated for the original real-world problems. To solve this issue, finding \emph{multiple} solutions is a natural direction, and diversity of solutions is an important concept in this context. Unfortunately, finding diverse solutions is much harder than finding a single solution. To cope with this difficulty, we investigate the approximability of finding diverse solutions. As our main result, we propose a framework to design approximation algorithms for finding diverse solutions, which yields several results, including constant-factor approximation algorithms for finding diverse matchings in graphs and diverse common bases in two matroids, and PTASes for finding diverse minimum cuts and interval schedulings.

Cut query algorithms with star contraction 2022-01-14
Show

We study the complexity of determining the edge connectivity of a simple graph with cut queries. We show that (i) there is a bounded-error randomized algorithm that computes edge connectivity with $O(n)$ cut queries, and (ii) there is a bounded-error quantum algorithm that computes edge connectivity with $Õ(\sqrt{n})$ cut queries. We prove these results using a new technique called "star contraction" to randomly contract edges of a graph while preserving non-trivial minimum cuts. In star contraction vertices randomly contract an edge incident on a small set of randomly chosen vertices. In contrast to the related 2-out contraction technique of Ghaffari, Nowicki, and Thorup [SODA'20], star contraction only contracts vertex-disjoint star subgraphs, which allows it to be efficiently implemented via cut queries. The $O(n)$ bound from item (i) was not known even for the simpler problem of connectivity, and improves the $O(n\log^3 n)$ bound by Rubinstein, Schramm, and Weinberg [ITCS'18]. The bound is tight under the reasonable conjecture that the randomized communication complexity of connectivity is $Ω(n\log n)$, an open question since the seminal work of Babai, Frankl, and Simon [FOCS'86]. The bound also excludes using edge connectivity on simple graphs to prove a superlinear randomized query lower bound for minimizing a symmetric submodular function. Item (ii) gives a nearly-quadratic separation with the randomized complexity and addresses an open question of Lee, Santha, and Zhang [SODA'21]. The algorithm can also be viewed as making $Õ(\sqrt{n})$ matrix-vector multiplication queries to the adjacency matrix. Finally, we demonstrate the use of star contraction outside of the cut query setting by designing a one-pass semi-streaming algorithm for computing edge connectivity in the vertex arrival setting. This contrasts with the edge arrival setting where two passes are required.
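
The access model is easy to simulate: a cut query returns the weight of edges leaving a vertex set. A small sketch of such an oracle on a hypothetical toy graph, plus one of the inclusion-exclusion identities cut-query algorithms build on:

```python
def make_cut_oracle(n, edges):
    """A cut-query oracle: given a vertex set S, return the number of
    edges with exactly one endpoint in S (the paper's access model)."""
    def cut(S):
        S = set(S)
        return sum(1 for u, v in edges if (u in S) != (v in S))
    return cut

# For disjoint S and T, cut(S) + cut(T) - cut(S ∪ T) equals twice the
# number of edges between S and T -- the kind of identity cut-query
# algorithms exploit to learn edges without seeing the graph.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
cut = make_cut_oracle(4, edges)
between_01_23 = (cut({0, 1}) + cut({2, 3}) - cut({0, 1, 2, 3})) // 2
```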

Parallel Minimum Cuts in $O(m \log^2(n))$ Work and Low Depth 2021-12-28
Show

We present a randomized $O(m \log^2 n)$ work, $O(\text{polylog } n)$ depth parallel algorithm for minimum cut. This algorithm matches the work bounds of a recent sequential algorithm by Gawrychowski, Mozes, and Weimann [ICALP'20], and improves on the previously best parallel algorithm by Geissmann and Gianinazzi [SPAA'18], which performs $O(m \log^4 n)$ work in $O(\text{polylog } n)$ depth. Our algorithm makes use of three components that might be of independent interest. Firstly, we design a parallel data structure that efficiently supports batched mixed queries and updates on trees. It generalizes and improves the work bounds of a previous data structure of Geissmann and Gianinazzi and is work efficient with respect to the best sequential algorithm. Secondly, we design a parallel algorithm for approximate minimum cut that improves on previous results by Karger and Motwani. We use this algorithm to give a work-efficient procedure to produce a tree packing, as in Karger's sequential algorithm for minimum cuts. Lastly, we design an efficient parallel algorithm for solving the minimum $2$-respecting cut problem.

This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 2021

Extensions of Karger's Algorithm: Why They Fail in Theory and How They Are Useful in Practice 2021-12-16
Show

The minimum graph cut and minimum $s$-$t$-cut problems are important primitives in the modeling of combinatorial problems in computer science, including in computer vision and machine learning. Some of the most efficient algorithms for finding global minimum cuts are randomized algorithms based on Karger's groundbreaking contraction algorithm. Here, we study whether Karger's algorithm can be successfully generalized to other cut problems. We first prove that a wide class of natural generalizations of Karger's algorithm cannot efficiently solve the $s$-$t$-mincut or the normalized cut problem to optimality. However, we then present a simple new algorithm for seeded segmentation / graph-based semi-supervised learning that is closely based on Karger's original algorithm, showing that for these problems, extensions of Karger's algorithm can be useful. The new algorithm has linear asymptotic runtime and yields a potential that can be interpreted as the posterior probability of a sample belonging to a given seed / class. We clarify its relation to the random walker algorithm / harmonic energy minimization in terms of distributions over spanning forests. On classical problems from seeded image segmentation and graph-based semi-supervised learning on image data, the method performs at least as well as the random walker / harmonic energy minimization / Gaussian processes.

Oral at ICCV 2021; added acknowledgements
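
Karger's original contraction algorithm, which the paper extends, fits in a few lines. A sketch for tiny graphs (vertex labels are relabeled directly instead of using a union-find):

```python
import random

def karger_once(n, edges, rng):
    """One run of Karger's contraction: repeatedly contract a uniformly
    random edge until two super-vertices remain, then return the number
    of original edges crossing between them. Repeating the run many
    times finds the global min cut with high probability."""
    label = list(range(n))
    alive = n
    pool = list(edges)
    while alive > 2:
        u, v = pool[rng.randrange(len(pool))]
        if label[u] == label[v]:
            # Drop self-loops created by earlier contractions and retry.
            pool = [e for e in pool if label[e[0]] != label[e[1]]]
            continue
        lu, lv = label[u], label[v]
        for i in range(n):
            if label[i] == lv:
                label[i] = lu
        alive -= 1
    return sum(1 for a, b in edges if label[a] != label[b])

# Two triangles joined by a single bridge: the global min cut is 1.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
rng = random.Random(1)
best = min(karger_once(6, edges, rng) for _ in range(200))
```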

Approximation algorithms for confidence bands for time series 2021-12-12
Show

Confidence intervals are a standard technique for analyzing data. When applied to time series, confidence intervals are computed for each time point separately. Alternatively, we can compute confidence bands, where we are required to find the smallest area enveloping $k$ time series, where $k$ is a user parameter. Confidence bands can then be used to detect abnormal time series, not just individual observations within a time series. We show that, despite the problem being NP-hard, it is possible to find an optimal confidence band for some $k$. We do this by considering a different problem: discovering regularized bands, where we minimize the envelope area minus the number of included time series weighted by a parameter $α$. Unlike normal confidence bands, this problem can be solved exactly by using a minimum cut. By varying $α$ we can obtain solutions for various $k$. If we have a constraint $k$ for which we cannot find an appropriate $α$, we demonstrate a simple algorithm that yields an $O(\sqrt{n})$ approximation guarantee by connecting the problem to a minimum $k$-union problem. This connection also implies that we cannot approximate the problem better than $O(n^{1/4})$ under some (mild) assumptions. Finally, we consider a variant where instead of minimizing the area we minimize the maximum width. Here, we demonstrate a simple 2-approximation algorithm and show that we cannot achieve a better approximation guarantee.
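
The minimum-maximum-width variant admits a simple heuristic in the spirit of the 2-approximation: anchor on each series, take its $k-1$ nearest neighbours in sup-distance, and keep the narrowest envelope. A sketch under that assumption, not necessarily the paper's exact algorithm:

```python
def envelope_width(series):
    """Maximum over time of (pointwise max - pointwise min)."""
    return max(max(col) - min(col) for col in zip(*series))

def min_width_band(series, k):
    """Heuristic for the minimum-width band enveloping k of the given
    series: for each anchor series, sort the others by sup-distance to
    the anchor, take the k nearest (including the anchor itself), and
    keep the narrowest envelope found. Sketch only."""
    best = None
    for anchor in series:
        near = sorted(series,
                      key=lambda s: max(abs(a - b)
                                        for a, b in zip(anchor, s)))
        cand = near[:k]
        w = envelope_width(cand)
        if best is None or w < best[0]:
            best = (w, cand)
    return best

# Three similar series plus one far-away outlier: a band covering
# k = 3 series should exclude the outlier.
data = [[0, 0, 0], [1, 1, 1], [0, 1, 0], [5, 5, 5]]
w, band = min_width_band(data, 3)
```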

Gomory-Hu Trees in Quadratic Time 2021-12-02
Show

Gomory-Hu tree [Gomory and Hu, 1961] is a succinct representation of pairwise minimum cuts in an undirected graph. When the input graph has general edge weights, classic algorithms need at least cubic running time to compute a Gomory-Hu tree. Very recently, the authors of [AKL+, arXiv v1, 2021] have improved the running time to $\tilde{O}(n^{2.875})$ which breaks the cubic barrier for the first time. In this paper, we refine their approach and improve the running time to $\tilde{O}(n^2)$. This quadratic upper bound is also obtained independently in an updated version by the same group of authors [AKL+, arXiv v2, 2021].
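
Once a Gomory-Hu tree is built, any pairwise minimum cut can be read off as the smallest edge weight on the tree path between the two vertices. A sketch of that query on a hypothetical tree (the tree values below are made up for illustration):

```python
def gh_min_cut(tree, s, t):
    """Pairwise minimum cut from a Gomory-Hu tree: the min-cut value
    between s and t equals the smallest edge weight on the unique tree
    path between them. `tree` maps each vertex to (parent, weight);
    the root's parent is itself."""
    def path_to_root(v):
        seen = [v]
        while tree[v][0] != v:
            v = tree[v][0]
            seen.append(v)
        return seen
    ps, pt = path_to_root(s), path_to_root(t)
    common = set(ps) & set(pt)
    best = float("inf")
    # Walk each path up to the lowest common ancestor, tracking the
    # minimum parent-edge weight seen.
    for path in (ps, pt):
        for v in path:
            if v in common:
                break
            best = min(best, tree[v][1])
    return best

# A hypothetical 4-vertex Gomory-Hu tree rooted at 0.
tree = {0: (0, 0), 1: (0, 3), 2: (1, 2), 3: (1, 5)}
val = gh_min_cut(tree, 2, 3)
```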

On the Robustness of Distributed Computing Networks 2021-11-26
Show

Traffic flows in a distributed computing network require both transmission and processing, and can be interdicted by removing either communication or computation resources. We study the robustness of a distributed computing network under the failures of communication links and computation nodes. We define cut metrics that measure the connectivity, and show a non-zero gap between the maximum flow and the minimum cut. Moreover, we study a network flow interdiction problem that minimizes the maximum flow by removing communication and computation resources within a given budget. We develop mathematical programs to compute the optimal interdiction, and polynomial-time approximation algorithms that achieve near-optimal interdiction in simulation.

International Conference on the Design of Reliable Communication Networks (DRCN)

About

Daily ArXiv Papers.
