| Pratyush Uppuluri |
Multi-Agent Systems |
This work presents a consensus-based Bayesian framework to detect malicious user behavior in enterprise directory access graphs. By modeling directories as topics and users as agents within a multi-level interaction graph, we simulate access evolution using influence-weighted opinion dynamics. Logical dependencies between users are encoded in dynamic matrices C_i, and directory similarity is captured via a shared influence matrix W. Malicious behavior is injected as cross-component logical perturbations that violate structural norms of strongly connected components (SCCs). We apply theoretical guarantees from the opinion dynamics literature to determine topic convergence and detect anomalies via scaled opinion variance. To quantify uncertainty, we introduce a Bayesian anomaly scoring mechanism that evolves over time, using both static and online priors. Simulations over synthetic access graphs validate our method, demonstrating its sensitivity to logical inconsistencies and robustness under dynamic perturbation.
|
| Mohamad Louai Shehab |
Learning Reward Machines from Partially Observed Policies |
Inverse reinforcement learning is the problem of inferring a reward function from an optimal policy or demonstrations by an expert. In this work, it is assumed that the reward is expressed as a reward machine whose transitions depend on atomic propositions associated with the state of a Markov Decision Process (MDP). Our goal is to identify the true reward machine using finite information. To this end, we first introduce the notion of a prefix tree policy which associates a distribution of actions to each state of the MDP and each attainable finite sequence of atomic propositions. Then, we characterize an equivalence class of reward machines that can be identified given the prefix tree policy. Finally, we propose a SAT-based algorithm that uses information extracted from the prefix tree policy to solve for a reward machine. It is proved that if the prefix tree policy is known up to a sufficient (but finite) depth, our algorithm recovers the exact reward machine up to the equivalence class. This sufficient depth is derived as a function of the number of MDP states and (an upper bound on) the number of states of the reward machine. These results are further extended to the case where we only have access to demonstrations from an optimal policy. Several examples, including discrete grid and block worlds, a continuous state-space robotic arm, and real data from experiments with mice, are used to demonstrate the effectiveness and generality of the approach.
|
| Oliver Shindell |
Robotic and Autonomous Systems |
This paper presents a novel system for flexible automated fabrication of microrobots with embedded permanent magnets, and for the loading of liquid therapeutic drugs and sealing with thermally sensitive wax. Microrobots featuring embedded magnets are more controllable and observable, and are capable of tasks requiring higher forces. In this system, a micromanipulator controls tweezers, and stepper motors actuate a four-stage system that executes different assembly steps. A syringe pump is used to fill drug delivery microrobots, and a wax seal is applied with a brush made from heated copper wires. This brush efficiently applies an even wax coating to drug delivery robots, sealing the therapeutics inside. Vision-based feedback from an overhead microscope camera ensures precise embedded magnet assembly through a combination of image processing algorithms. A single robot takes approximately 192 seconds to assemble, plus 119 seconds for additional embedded magnets beyond the first, corresponding to 45.7% and 51.8%, respectively, of the time required by a trained human. Drug loading and sealing takes around 165 seconds, 36.7% of the manual operation time, and offers a significant improvement in seal consistency and in control of the thickness and area of application. This work advances micro-assembly toward practical medical use by establishing a basis for mass production.
|
| Maheed Ahmed |
Online Laplacian-Based Representation Learning in Reinforcement Learning |
Representation learning plays a crucial role in reinforcement learning, especially in complex environments with high-dimensional and unstructured states. Effective representations can enhance the efficiency of learning algorithms by improving sample efficiency and generalization across tasks. This paper considers the Laplacian-based framework for representation learning, where the eigenvectors of the Laplacian matrix of the underlying transition graph are leveraged to encode meaningful features from raw sensory observations of the states. Despite the promising algorithmic advances in this framework, it remains an open question whether the Laplacian-based representations can be learned online and with theoretical guarantees along with policy learning. We address this by formulating an online optimization approach using the Asymmetric Graph Drawing Objective (AGDO) and analyzing its convergence via online projected gradient descent under mild assumptions. Our extensive simulation studies empirically validate the convergence guarantees to the true Laplacian representation. Furthermore, we provide insights into the compatibility of different reinforcement learning algorithms with online representation learning.
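As a rough illustration of what a Laplacian representation is, the sketch below computes it offline for a toy transition graph by direct eigendecomposition. This is not the online AGDO procedure described in the abstract. The 5-state chain graph, the function name `laplacian_representation`, and the choice of two features are assumptions made only for the example.

```python
import numpy as np

def laplacian_representation(adjacency, dim):
    """Return the `dim` smallest Laplacian eigenvalues and their eigenvectors.

    Each row of the returned eigenvector matrix is the feature vector of one state.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency  # combinatorial graph Laplacian L = D - A
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    return eigvals[:dim], eigvecs[:, :dim]

# Hypothetical toy environment: 5 states arranged in a chain (undirected transitions)
n_states = 5
A = np.zeros((n_states, n_states))
for s in range(n_states - 1):
    A[s, s + 1] = A[s + 1, s] = 1.0

eigvals, features = laplacian_representation(A, dim=2)
print(features.shape)  # one 2-dimensional feature vector per state
```

In a large or continuous state space the transition graph is not available in this explicit form, which is why the representation has to be learned from samples rather than computed directly.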
|
| Hainan Wang |
Gaussian Process Surrogates for Robust and Stochastic Optimization in Chemical Systems |
Gaussian Process (GP) regression provides a flexible, nonparametric framework for surrogate modeling that naturally quantifies model-form and prediction uncertainty. As summarized in Rasmussen’s foundational tutorial on GP models [1], GPs enable smooth interpolation, closed-form uncertainty estimates, and analytical gradient properties that make them well-suited for optimization under uncertainty. Inspired by the Bayesian calibration philosophy of Kennedy and O’Hagan [2] and more recent developments on Bayesian hybrid modeling for optimization under epistemic uncertainty [3,4,5], this work develops a unified study of embedding GP surrogates into deterministic, robust, and stochastic optimization formulations within Pyomo. Benchmark chemical-process case studies demonstrate that GP surrogates retain predictive accuracy while enabling efficient uncertainty propagation. Deterministic embeddings provide rapid solutions but may underestimate risk. Robust and stochastic formulations substantially improve reliability. Unlike classical hybrid models, where GPs correct mechanistic-model discrepancy [3], the present framework treats the GP as an independent predictive layer integrated directly with Pyomo’s nonlinear and stochastic programming interfaces. This enables model modularity, solver compatibility (IPOPT/Gurobi), and seamless switching between data-driven and physics-informed components. Overall, the study provides a unified and practical pathway for incorporating GP surrogates into decision-making workflows in chemical systems. The framework supports applications in process design, digital twins, and operations under uncertainty. Future work will extend the approach to multi-output and heteroscedastic GPs to further improve reliability in data-limited environments.
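To make the "closed-form uncertainty estimates" concrete, here is a minimal NumPy sketch of the standard GP posterior mean and variance with a squared-exponential kernel. It only illustrates the textbook formulas and does not reproduce the Pyomo embedding from the study. The kernel hyperparameters, the function names, and the sine-function training data are assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-6):
    """Posterior mean and pointwise variance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)  # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # K^{-1} y
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v**2, axis=0)
    return mean, var

# Hypothetical data: three samples of sin(x)
x_train = np.array([0.0, 1.0, 2.0])
y_train = np.sin(x_train)
x_test = np.array([1.0, 5.0])  # one point at a sample, one far from the data

mean, var = gp_posterior(x_train, y_train, x_test)
print(mean, var)
```

The variance is close to zero at the sampled input and grows toward the prior variance far from the data. That growth is the quantity a robust or stochastic formulation can use to penalize decisions in poorly sampled regions.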
Keywords: Gaussian Process Regression; Surrogate Modeling; Robust Optimization; Stochastic Programming; Uncertainty Quantification.
Acknowledgements: This research is supported by the U.S. National Science Foundation (NSF) under Award CBET-1941596.
|
| Keshav Kasturi Rangan |
Dynamic Modeling and Intrusive Uncertainty Quantification for Membrane Separation Systems |
Membrane separation processes play an essential role in water treatment, chemical manufacturing, and the recovery of critical materials. However, the design and interpretation of these systems are often hindered by uncertainty in transport models and their associated parameters. This work develops a framework for dynamic modeling and intrusive uncertainty quantification (UQ) to characterize neutral and charged nanofiltration membranes using data from dynamic diafiltration experiments. The approach integrates physics-based modeling and sensitivity analysis within an equation-oriented computational environment. The framework will employ ParmEst and Pyomo.DoE within the Pyomo ecosystem to perform parameter estimation and experiment design. These intrusive methods directly use the model structure (e.g., derivatives via symbolic/automatic differentiation) for parameter estimation and model discrimination. A weighted sum of squared errors formulation supports robust parameter estimation across heterogeneous datasets. At the same time, Fisher Information Matrix criteria (A-, D-, E-, and ME-optimality) are proposed to guide the selection of operating conditions that maximize information content. These tools facilitate the evaluation of alternative transport models and support the quantitative comparison of multi-ion transport mechanisms. Experimentally, a dynamic diafiltration apparatus enables automated adjustment of operating conditions, allowing iterative refinement of parameters and experiment designs. This coupling reflects principles central to control and optimization: feedback-driven selection of experimental inputs, sensitivity-guided operation, and efficient allocation of experimental effort. The workflow demonstrates how integrating intrusive UQ with MBDoE strengthens the predictive use of mechanistic transport models while reducing uncertainty in parameter estimates. 
Overall, the framework links transport modeling with techniques from optimization and automation to support uncertainty-aware membrane characterization. The approach provides a structured basis for adaptive experimentation and contributes to systematic analysis of membrane systems relevant to water treatment, energy, and materials recovery.
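The Fisher Information Matrix criteria mentioned above can be illustrated with a short NumPy sketch. It does not call ParmEst or Pyomo.DoE. The sensitivity matrix, the measurement variances, and the function names are hypothetical. The sketch assumes the convention in which A-, D-, E-, and ME-optimality are scored by the trace, determinant, minimum eigenvalue, and condition number of the FIM; some texts instead define A-optimality through the trace of the inverse FIM.

```python
import numpy as np

def fisher_information(sensitivity, meas_var):
    """FIM = J^T diag(1/var) J for independent Gaussian measurement errors.

    sensitivity: (n_measurements, n_parameters) matrix of d(output)/d(parameter).
    meas_var:    per-measurement error variances.
    """
    J = np.asarray(sensitivity, dtype=float)
    weights = 1.0 / np.asarray(meas_var, dtype=float)
    return J.T @ (weights[:, None] * J)

def design_criteria(fim):
    """Scalar summaries of the FIM under the convention assumed in the lead-in."""
    eigvals = np.linalg.eigvalsh(fim)  # ascending order for symmetric matrices
    return {
        "A": float(np.trace(fim)),               # trace
        "D": float(np.linalg.det(fim)),          # determinant
        "E": float(eigvals[0]),                  # minimum eigenvalue
        "ME": float(eigvals[-1] / eigvals[0]),   # condition number
    }

# Hypothetical experiment: 3 measurements, 2 parameters, unit error variance
J = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
fim = fisher_information(J, meas_var=[1.0, 1.0, 1.0])
criteria = design_criteria(fim)
print(criteria)
```

Comparing these scores across candidate operating conditions is one way to rank experiments by expected information content. A-, D-, and E-type scores are maximized, while the condition number is minimized.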
|
| Joonwon Choi |
Koopman Operator-based Network Modularization for Policy Reuse |
In this project, we propose an algorithm for Koopman operator-based neural network modularization of a pre-trained network and its application to policy reuse. Task modularization for a neural network (NN) decomposes a pretrained network into subnetworks and adapts it for the target task. It has been widely studied due to its benefits in improving interpretability and performance on the target task without fine-tuning the network’s parameters. Nevertheless, most existing techniques rely on heuristics without providing a theoretical foundation for the underlying mechanism. To address this issue, we first propose a decomposition algorithm based on the Koopman operator theory to decompose a pre-trained network into several subnetworks. Considering NN as an autonomous system, we develop Padded Extended Dynamic Mode Decomposition (PEDMD) to approximate an arbitrary NN as a Koopman operator. The Koopman operator computed from PEDMD is then decomposed by applying the Koopman Mode Decomposition (KMD), where each mode represents a distinct subnetwork. Thus, the complex correlation between subnetworks can be represented as a linear operation between Koopman mode-eigenfunction pairs in the Koopman observable space. Furthermore, the original network can be easily reconstructed as a linear combination of subnetworks.
|
| Erkan Bayram |
Control Disturbance Rejection in Neural ODEs |
In this paper, we propose an iterative training algorithm for Neural ODEs that provides models resilient to control (parameter) disturbances. The method builds on our earlier work, Tuning without Forgetting. As in that work, it introduces training points sequentially and updates the parameters on new data within the space of parameters that do not decrease performance on the previously learned training points. The key difference is that, inspired by the concept of flat minima, we solve a minimax problem for a non-convex non-concave functional over an infinite-dimensional control space. We develop a projected gradient descent algorithm on the space of parameters that admits the structure of an infinite-dimensional Banach subspace. We show through simulations that this formulation enables the model to effectively learn new data points and gain robustness against control disturbance.
|