Artificial Intelligence, Machine Learning, Computer Science

AI and ML for robotics, automated planning, game theory, speech recognition, data visualization, rule-based expert systems and logic programs.

ISR artificial intelligence research on hierarchical task network planning has influenced nearly all subsequent work in this area. We have a deep understanding of artificial intelligence planning and the use of mean-field game theory to predict decisions. Our pioneering history in data visualization produced Spotfire, a starfield multidimensional data visualization tool based on dynamic queries, and Treemaps, a space-filling method for visualizing large hierarchical collections of quantitative data that lets users see thousands of data items in a fixed space, facilitating the discovery of patterns, clusters and outliers. We have expertise in formal methods for the description and analysis of concurrent and distributed systems, and in model checking and abstract interpretation for embedded control and systems biology. Today we are introducing innovations in computer vision and hyperdimensional computing theory for robots, as well as geometric and scientific algorithms for autonomous vehicles, computer graphics, and virtual reality.

Recent publications

2021

Detecting and Counting Oysters

Behzad Sadrfaridpour, Yiannis Aloimonos, Miao Yu, Yang Tao, Donald Webster

To test the idea that advancements in robotics and artificial intelligence can improve the monitoring of oyster beds, the researchers equipped a remotely operated underwater vehicle (ROV) with a camera and filmed in the Chesapeake Bay. They then used these videos to train convolutional neural networks (CNNs) to count oysters and track them across consecutive image frames so that individual oysters are not counted multiple times.
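
For illustration only (not the authors' code), the minimal Python sketch below shows the general pattern the description implies: run a CNN detector on each frame and associate detections across consecutive frames, so an oyster that persists over frames is counted once. The `detect` function is a hypothetical placeholder for any trained CNN detector.

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def count_unique(frames, detect, iou_thresh=0.5):
    """`detect(frame)` is a placeholder for a CNN detector returning bounding boxes."""
    tracks, total = [], 0          # tracks = boxes seen in the previous frame
    for frame in frames:
        boxes = detect(frame)
        new_tracks = []
        for box in boxes:
            # A detection overlapping an existing track is the same oyster, not a new one.
            matched = any(iou(box, t) >= iou_thresh for t in tracks)
            if not matched:
                total += 1          # first time this oyster is seen
            new_tracks.append(box)
        tracks = new_tracks
    return total
```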

arXiv.org

SpikeMS: Deep Spiking Neural Network for Motion Segmentation

Chethan M. Parameshwara, Simin Li, Cornelia Fermüller, Nitin J. Sanket, Matthew S. Evanusa, Yiannis Aloimonos

The researchers propose SpikeMS, the first deep encoder-decoder SNN architecture for the real-world large-scale problem of motion segmentation using the event-based DVS camera as input.

arXiv.org

2020

Deep Reservoir Networks with Learned Hidden Reservoir Weights using Direct Feedback Alignment

Matthew Evanusa, Cornelia Fermüller, Yiannis Aloimonos

The researchers present a novel Deep Reservoir Network for time series prediction and classification that learns through non-differentiable hidden reservoir layers using a biologically-inspired back propagation alternative. This alternative, called Direct Feedback Alignment, resembles global dopamine signal broadcasting in the brain. The researchers demonstrate its efficacy on two real-world multidimensional time series datasets.
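
As background, Direct Feedback Alignment replaces the transposed forward weights of backpropagation with a fixed random feedback matrix that broadcasts the output error to the hidden units. A minimal NumPy sketch of that update on a toy dense network (illustrative only, not the paper's reservoir architecture) is:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 8, 32, 2
W1 = rng.normal(scale=0.1, size=(n_in, n_hid))
W2 = rng.normal(scale=0.1, size=(n_hid, n_out))
B = rng.normal(scale=0.1, size=(n_out, n_hid))    # fixed random feedback matrix
lr = 1e-2

for _ in range(1000):
    x = rng.normal(size=n_in)
    y = np.array([x[:4].sum(), x[4:].sum()])      # toy regression target
    h = np.tanh(x @ W1)                           # hidden activity
    e = h @ W2 - y                                # output error
    # DFA: the error is broadcast to the hidden layer through the fixed random
    # matrix B, rather than backpropagated through W2.T.
    dh = (e @ B) * (1.0 - h ** 2)
    W2 -= lr * np.outer(h, e)
    W1 -= lr * np.outer(x, dh)
```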

arXiv.org

A Deep 2-Dimensional Dynamical Spiking Neuronal Network for Temporal Encoding trained with STDP

Matthew Evanusa, Cornelia Fermüller, Yiannis Aloimonos

The researchers show that a large, deep layered spiking neural network with dynamical, chaotic activity mimicking the mammalian cortex with biologically-inspired learning rules, such as STDP, is capable of encoding information from temporal data.
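
For reference, a standard pair-based STDP rule (one common form; the paper's exact variant may differ) adjusts a synapse according to the relative timing of pre- and postsynaptic spikes:

```latex
\Delta w =
\begin{cases}
A_{+}\, e^{-\Delta t/\tau_{+}}, & \Delta t > 0 \quad (\text{pre before post: potentiation})\\[2pt]
-A_{-}\, e^{\,\Delta t/\tau_{-}}, & \Delta t < 0 \quad (\text{post before pre: depression})
\end{cases}
\qquad \Delta t = t_{\text{post}} - t_{\text{pre}}
```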

arXiv.org

Hybrid Backpropagation Parallel Reservoir Networks

Matthew Evanusa, Snehesh Shrestha, Michelle Girvan, Cornelia Fermüller, Yiannis Aloimonos

Demonstrates the use of a backpropagation hybrid mechanism for parallel reservoir computing with a meta-ring structure and its application to a real-world gesture recognition dataset. This mechanism can be used as an alternative to state-of-the-art recurrent neural networks such as LSTMs and GRUs.

arXiv.org

Egocentric Object Manipulation Graphs

Eadom Dessalene, Michael Maynord, Chinmaya Devaraj, Cornelia Fermüller, Yiannis Aloimonos

Introduces Egocentric Object Manipulation Graphs (Ego-OMG): a novel representation for activity modeling and anticipation of near-future actions.

arXiv.org

Symbolic Representation and Learning with Hyperdimensional Computing

Anton Mitrokhin, Peter Sutor, Douglas Summers-Stay, Cornelia Fermüller, Yiannis Aloimonos

By using hashing neural networks to produce binary vector representations of images, the authors show how hyperdimensional vectors can be constructed such that vector-symbolic inference arises naturally out of their output.
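
As background on the vector-symbolic operations involved, the toy sketch below shows binary hypervectors with XOR binding, majority-vote bundling, and Hamming-based similarity; it is illustrative only and not the authors' pipeline.

```python
import numpy as np

D = 10_000
rng = np.random.default_rng(0)

def rand_hv():
    return rng.integers(0, 2, size=D, dtype=np.uint8)

def bind(a, b):                 # associate two hypervectors (e.g., role and filler)
    return np.bitwise_xor(a, b)

def bundle(*hvs):               # superpose several hypervectors into one
    return (np.sum(hvs, axis=0) > len(hvs) / 2).astype(np.uint8)

def similarity(a, b):           # 1.0 = identical, ~0.5 = unrelated random vectors
    return 1.0 - np.count_nonzero(a != b) / D

# Example: store the pair (color = red) and recover "red" by unbinding with "color".
color, red = rand_hv(), rand_hv()
record = bind(color, red)
assert similarity(bind(record, color), red) == 1.0
```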

Frontiers in Robotics and AI

Learning Visual Motion Segmentation using Event Surfaces

Anton Mitrokhin, Zhiyuan Hua, Cornelia Fermüller, Yiannis Aloimonos

Presents a Graph Convolutional neural network for the task of scene motion segmentation by a moving camera. Describes spatial and temporal features of event clouds, which provide cues for motion tracking and segmentation.

Computer Vision Foundation

MOMS with Events: Multi-Object Motion Segmentation with Monocular Event Cameras

Chethan M. Parameshwara, Nitin J. Sanket, Arjun Gupta, Cornelia Fermüller, Yiannis Aloimonos

A solution to multi-object motion segmentation that combines classical optimization methods with deep learning and does not require prior knowledge of the 3D motion or of the number and structure of objects.

arXiv.org

Following Instructions by Imagining and Reaching Visual Goals

John Kanu, Eadom Dessalene, Xiaomin Lin, Cornelia Fermüller, Yiannis Aloimonos

A novel robotic agent framework for learning to perform temporally extended tasks using spatial reasoning in a deep reinforcement learning framework, by sequentially imagining visual goals and choosing appropriate actions to fulfill imagined goals.

arXiv.org

2020

Teaching Machines to Understand Urban Networks: A Graph Autoencoder Approach

Maria Coelho, Mark Austin, Shivam Mishra, Mark Blackburn

Due to remarkable advances in computer, communications and sensing technologies over the past three decades, large-scale urban systems are now far more heterogeneous and automated than their predecessors. They may, in fact, be connected to other types of systems in completely new ways. These characteristics make the tasks of system design, analysis and integration of multi-disciplinary concerns much more difficult than in the past. We believe these challenges can be addressed by teaching machines to understand urban networks. This paper explores opportunities for using a recently developed graph autoencoding approach to encode the structure and associated network attributes as low-dimensional vectors. We exercise the proposed approach on a problem involving identification of leaks in urban water distribution systems.

IARIA International Journal on Advances in Networks and Services

2021

Online Deterministic Annealing for Classification and Clustering

Christos Mavridis, John Baras

An online prototype-based learning algorithm for clustering and classification, based on the principles of deterministic annealing.
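
A minimal batch sketch of the deterministic-annealing principle behind the algorithm: soft Gibbs assignments at a temperature that is gradually lowered, followed by weighted prototype updates. The parameter values are arbitrary, and the paper's online version differs.

```python
import numpy as np

def deterministic_annealing(X, k=3, T=10.0, T_min=0.01, cooling=0.9, iters=20):
    rng = np.random.default_rng(0)
    mu = X[rng.choice(len(X), size=k, replace=False)].astype(float)   # prototypes
    while T > T_min:
        for _ in range(iters):
            d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)      # squared distances
            d2 -= d2.min(axis=1, keepdims=True)                       # numerical stability
            p = np.exp(-d2 / T)
            p /= p.sum(axis=1, keepdims=True)                         # Gibbs association weights
            mu = (p.T @ X) / (p.sum(axis=0)[:, None] + 1e-12)         # weighted centroid update
        T *= cooling                                                   # lower the temperature
    return mu
```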

arXiv.org

2020

Models and Methods for Intelligent Highway Routing of Human-Driven and Connected-and-Automated Vehicles

Fatemeh Alimardani, Nilesh Suriyarachchi, Faizan Tariq, John Baras

Explores the integration of two of the most common traffic management strategies, namely, ramp metering and route guidance, into existing highway networks with human-driven vehicles.

Chapter in the forthcoming book, Transportation Systems for Smart, Sustainable, Inclusive and Secure Cities

Convergence of Stochastic Vector Quantization and Learning Vector Quantization with Bregman Divergences

Christos Mavridis, John Baras

The researchers investigate the convergence properties of stochastic vector quantization (VQ) and its supervised counterpart, Learning Vector Quantization (LVQ), using Bregman divergences. They employ the theory of stochastic approximation to study the conditions on the initialization and on the Bregman divergence generating functions under which the algorithms converge to desired configurations. These results formally support the use of Bregman divergences, such as the Kullback-Leibler divergence, in vector quantization algorithms.
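
For reference, the Bregman divergence generated by a strictly convex, differentiable function $\phi$ is

```latex
D_{\phi}(x, y) \;=\; \phi(x) - \phi(y) - \langle \nabla \phi(y),\, x - y \rangle ,
```

with $\phi(x) = \lVert x \rVert^2$ recovering the squared Euclidean distance and $\phi(x) = \sum_i x_i \log x_i$ (on the probability simplex) recovering the Kullback-Leibler divergence.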

johnbaras.com

Order Effects of Measurements in Multi-Agent Hypothesis Testing

Aneesh Raghavan, John Baras

This paper pertains to stochastic multi-agent decision-making problems. The authors revisit the concepts of event-state-operation-structure and the relationship of incompatibility from the literature, and use them as tools to study the algebraic structure of a set of events. They consider a multi-agent hypothesis testing problem and show that the set of events forms an ortholattice. They then consider the binary hypothesis testing problem with a finite observation space.

arXiv.org

Cooperative Hypothesis Testing by Two Observers with Asymmetric Information

Aneesh Raghavan, John Baras

This paper pertains to hypothesis testing problems, specifically the problem of collaborative binary hypothesis testing.

arXiv.org

Interpretable machine learning models: A physics-based view

Ion Matei, Johan de Kleer, Christoforos Somarakis, Rahul Rai, John Baras

To understand changes in physical systems and facilitate decisions, explaining how model predictions are made is crucial. In this paper the authors use model-based interpretability, where models of physical systems are constructed by composing basic constructs that explain locally how energy is exchanged and transformed.

arXiv.org

2019

Event-Triggered Add-on Safety for Connected and Automated Vehicles using Roadside Network Infrastructure

Mohammad Mamduhi, Karl Johansson, Ehsan Hashemi, John Baras

This paper proposes an event-triggered, add-on safety mechanism in a networked vehicular system that can adjust control parameters for timely braking while maintaining maneuverability.

arXiv.org

2021

Generalized AdaGrad (G-AdaGrad) and Adam: A State Space Perspective

Kushal Chakrabarti, Nikhil Chopra

Accelerated gradient-based methods are being extensively used for solving non-convex machine learning problems, especially when the data points are abundant or the available data is distributed across several agents. Two of the prominent accelerated gradient algorithms are AdaGrad and Adam. AdaGrad is the simplest accelerated gradient method, particularly effective for sparse data. Adam has been shown to perform favorably in deep learning problems compared to other methods. Here the authors propose a new fast optimizer, Generalized AdaGrad (G-AdaGrad), for accelerating the solution of potentially non-convex machine learning problems.
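
For context, the standard AdaGrad update that G-AdaGrad generalizes accumulates squared gradients to scale each coordinate's step size; a minimal sketch (not the paper's state-space formulation) is:

```python
import numpy as np

def adagrad(grad_f, x0, lr=0.1, eps=1e-8, steps=1000):
    x = np.asarray(x0, dtype=float)
    g2_sum = np.zeros_like(x)                  # running sum of squared gradients
    for _ in range(steps):
        g = grad_f(x)
        g2_sum += g ** 2
        x -= lr * g / (np.sqrt(g2_sum) + eps)  # per-coordinate adaptive step size
    return x

# Example: minimize a simple quadratic f(x) = (x - 3)^2.
x_star = adagrad(lambda x: 2 * (x - 3.0), x0=[0.0])
```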

arXiv.org

2020

Iterative Pre-Conditioning for Expediting the Gradient-Descent Method: The Distributed Linear Least-Squares Problem

Kushal Chakrabarti, Nirupam Gupta, Nikhil Chopra

This paper considers the multi-agent linear least-squares problem in a server-agent network. The system comprises multiple agents, each having a set of local data points, that are connected to a server. The goal for the agents is to compute a linear mathematical model that optimally fits the collective data points held by all the agents, without sharing their individual local data points. The paper proposes an iterative pre-conditioning technique that mitigates the deleterious effect of the conditioning of data points on the rate of convergence of the gradient-descent method.
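
A sketch of the baseline server-agent protocol described here, in which agents share only local gradients and never their data points; the paper's iterative pre-conditioner (not shown, and the details below are assumptions) additionally multiplies the aggregated gradient by a matrix that is itself refined across iterations to counteract poor conditioning.

```python
import numpy as np

def distributed_least_squares(agents, dim, lr=1e-3, steps=5000):
    """agents: list of (A_i, b_i) pairs held locally; only gradients reach the server."""
    x = np.zeros(dim)
    for _ in range(steps):
        # Each agent i computes A_i^T (A_i x - b_i) locally; the server aggregates.
        grad = sum(A.T @ (A @ x - b) for A, b in agents)
        x -= lr * grad          # plain gradient descent; convergence rate depends on
                                # conditioning, which the pre-conditioner addresses
    return x
```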

arXiv.org

2020

Temporal-Logic Query Checking over Finite Data Streams

Samuel Huang, Rance Cleaveland

This paper describes a technique for inferring temporal-logic properties for sets of finite data streams. Such data streams arise in many domains, including server logs, program testing, and financial and marketing data; temporal-logic formulas that are satisfied by all data streams in a set can provide insight into the underlying dynamics of the system generating these streams. The authors' approach makes use of so-called Linear Temporal Logic (LTL) queries, which are LTL formulas containing a missing subformula and interpreted over finite data streams. Solving such a query involves computing a subformula that can be inserted into the query so that the resulting grounded formula is satisfied by all data streams in the set. The paper describes an automaton-driven approach to solving this query-checking problem and demonstrates a working implementation via a pilot study.
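
An illustrative query (not one taken from the paper): the placeholder "?" marks the missing subformula, and solving the query means finding a grounding satisfied by every stream in the set, e.g.

```latex
\mathbf{G}(\mathit{request} \rightarrow \mathbf{F}\,?)
\quad\leadsto\quad
\mathbf{G}(\mathit{request} \rightarrow \mathbf{F}\,\mathit{ack})
```

if every data stream eventually answers each request with ack.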

arXiv.org

Timed Automata Benchmark Description

Peter Fontana, Rance Cleaveland

This report contains descriptions of the timed automata (models) and the properties (specifications) that are used as benchmark examples in “Data structure choices for on-the-fly model checking of real-time systems” and “The power of proofs: New algorithms for timed automata model checking.” The four models from those sources are CSMA, FISCHER, LEADER, and GRC. The report also includes two further models: FDDI and PATHOS. These six models are often used to benchmark the speed of timed automata model checkers throughout the timed automata model checking literature.

arXiv.org

Better Automata through Process Algebra

Rance Cleaveland

This paper shows how the use of Structural Operational Semantics (SOS) in the style popularized by the process-algebra community can lead to a more succinct and useful construction for building finite automata from regular expressions.

arXiv.org

2021

Speech acoustics and mental health assessment

Carol Espy-Wilson

Dr. Espy-Wilson discusses a speech inversion system her group has developed that maps the acoustic signal to vocal tract variables (TVs). The trajectories of the TVs show the timing and spatial movement of speech gestures. She explains how her group uses machine learning techniques to compute articulatory coordination features (ACFs) from the TVs. The ACFs serve as an input into a deep learning model for mental health classification. Espy-Wilson also illustrates the key acoustic differences between speech produced by subjects when they are mentally ill relative to when they are in remission and relative to healthy controls. The ultimate goal of this research is the development of a technology (perhaps an app) for patients that can help them, their therapists and caregivers monitor their mental health status between therapy sessions.

Keynote speech at the 2021 Acoustical Society of America Annual Meeting, June 8, 2021
View a press release from the Acoustical Society of America about this speech

Speech-based Depression Severity Level Classification Using a Multi-Stage Dilated CNN-LSTM Model

Nadee Seneviratne, Carol Espy-Wilson

The paper proposes a new multi-stage architecture trained on vocal tract variable (TV)-based articulatory coordination features (ACFs) for depression severity classification, which clearly outperforms the baseline models. The authors establish that the robustness of TV-based ACFs extends beyond mere detection of depression to severity-level classification. This work can be extended to a multi-modal system that takes advantage of textual information obtained through automatic speech recognition tools, since linguistic features can reveal important information about the verbal content of a depressed patient's speech relating to their mental health condition.

arXiv.org; accepted for Interspeech 2021, Aug. 30-Sept. 3, 2021

Inverted Vocal Tract Variables and Facial Action Units to Quantify Neuromotor Coordination in Schizophrenia

Yashish Maduwantha, Chris Kitchen, Deanna L. Kelly, Carol Espy-Wilson

This study, conducted with AIM-HI funding, investigates speech articulatory coordination in schizophrenia subjects exhibiting strong positive symptoms (e.g., hallucinations and delusions), using a time-delay embedded correlation analysis. It finds a distinction in the neuromotor coordination of speech between healthy and schizophrenia subjects.

ResearchGate.net

2020

Modeling Feature Representations for Affective Speech using Generative Adversarial Networks

Saurabh Sahu, Rahul Gupta, Carol Espy-Wilson

Implements three autoencoder- and GAN-based models to synthetically generate higher-dimensional feature vectors, useful for speech emotion recognition, from a simpler prior distribution p(z).

IEEE Transactions on Affective Computing

2019

Multi-modal learning for speech emotion recognition: An analysis and comparison of ASR outputs with ground truth transcription

Saurabh Sahu, Vikramjit Mitra, Nadee Seneviratne, Carol Espy-Wilson

The paper leverages multi-modal learning and automated speech recognition (ASR) systems toward building a speech-only emotion recognition model.

Interspeech 2019

2019

Simulation-based algorithms for Markov decision processes: Monte Carlo tree search from AlphaGo to AlphaZero

Michael Fu

The deep neural networks of AlphaGo and AlphaZero can be traced back to an adaptive multistage sampling (AMS) simulation-based algorithm for Markov decision processes published by H.S. Chang, Michael C. Fu and Steven I. Marcus in Operations Research in 2005. Here, Fu retraces that history, discusses the impact of the initial research, and suggests enhancements for the future.

Asia-Pacific Journal of Operational Research

2020

Automatic Shape Optimization of Patient-Specific Tissue Engineered Vascular Grafts for Aortic Coarctation

Xiaolong Liu, Seda Aslan, Rachel Hess, Paige Mass, Laura Olivieri, Yue-Hin Loke, Narutoshi Hibino, Mark Fuge, Axel Krieger

Develops a computational framework for automatically designing optimal shapes of patient-specific TEVGs for aorta surgery.

42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society

Adaptive Expansion Bayesian Optimization for Unbounded Global Optimization

Wei Chen, Mark Fuge

The authors propose a Bayesian optimization approach that only needs to specify an initial search space that does not necessarily include the global optimum, and expands the search space when necessary.

arXiv.org

2021

Binaural Audio Generation via Multi-task Learning

Sijia Li, Shiguang Liu, Dinesh Manocha

A learning-based approach for generating binaural audio from mono audio using multi-task learning.

arXiv.org

AgentDress: Realtime Clothing Synthesis for Virtual Agents using Plausible Deformations

Nannan Wu, Qianwen Chao, Yanzhen Chen, Weiwei Xu, Chen Liu, Dinesh Manocha, Wenxin Sun, Yi Han, Xinran Yao, Xiaogang Jin

A CPU-based real-time cloth animation method for dressing virtual humans of various shapes and poses.

IEEE Transactions on Visualization and Computer Graphics

DeepEigen: Learning-based Modal Sound Synthesis with Acoustic Transfer Maps

Xutong Jin, Sheng Li, Dinesh Manocha, Guoping Wang

A learning-based approach to compute the eigenmodes and acoustic transfer data for the sound synthesis of arbitrary solid objects. The approach combines two network-based solutions to formulate a complete learning-based 3D modal sound model.

arXiv.org

DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes

Dongki Jung, Jaehoon Choi, Yonghan Lee, Deokhwa Kim, Changick Kim, Dinesh Manocha, Donghwan Lee

This approach estimates depth from a monocular camera as it moves through complex and crowded indoor environments, e.g., a department store or a metro station. The approach predicts absolute scale depth maps over the entire scene consisting of a static background and multiple moving people, by training on dynamic scenes.

arXiv.org

Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning

Uttaran Bhattacharya, Elizabeth Childs, Nicholas Rewkowski, Dinesh Manocha

A generative adversarial network to synthesize 3D pose sequences of co-speech upper-body gestures with appropriate affective expressions.

arXiv.org

TIMERS: Document-Level Temporal Relation Extraction

Puneet Mathur, Rajiv Jain, Franck Dernoncourt, Vlad Morariu, Quan Hung Tran, Dinesh Manocha

TIMERS is a TIME, Rhetorical and Syntactic-aware model for document-level temporal relation classification. TIMERS leverages rhetorical discourse features and temporal arguments from semantic role labels, in addition to traditional local syntactic features, trained through a Gated Relational-GCN.

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (short papers)

Improving Reverberant Speech Separation with Multi-Stage Training and Curriculum Learning

Rohith Aralikatti, Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha

A new approach to improving the performance of reverberant speech separation, based on an accurate geometric acoustic simulator (GAS) which generates realistic room impulse responses (RIRs) by modeling both specular and diffuse reflections.

arXiv.org

Redirected Walking in Static and Dynamic Scenes Using Visibility Polygons

Niall L. Williams, Aniket Bera, Dinesh Manocha

A new approach for redirected walking in static and dynamic virtual environment scenes that uses techniques from robot motion planning to compute the redirection gains that steer the user on collision-free paths in the physical space.

IEEE Transactions on Visualization and Computer Graphics

Point-based Acoustic Scattering for Interactive Sound Propagation via Surface Encoding

Hsien-Yu Meng, Zhenyu Tang, Dinesh Manocha

A novel geometric deep learning method to compute the acoustic scattering properties of geometric objects. This learning algorithm uses a point cloud representation of objects to compute the scattering properties and integrates them with ray tracing for interactive sound propagation in dynamic scenes.

arXiv.org

Learning Acoustic Scattering Fields for Dynamic Interactive Sound Propagation

Zhenyu Tang, Hsien-Yu Meng, Dinesh Manocha

A novel hybrid sound propagation algorithm for interactive applications. The approach is designed for dynamic scenes and uses a neural network-based learned scattered field representation along with ray tracing to efficiently generate specular, diffuse, diffraction and occlusion effects.

2021 IEEE Conference on Virtual Reality and 3D User Interfaces

Text2Gestures: A Transformer-Based Network for Generating Emotive Body Gestures for Virtual Agents

Uttaran Bhattacharya, Nicholas Rewkowski, Abhishek Banerjee, Pooja Guhan, Aniket Bera, Dinesh Manocha

Text2Gestures is a transformer-based learning method to interactively generate emotive full-body gestures for virtual agents aligned with natural language text inputs.

2021 IEEE Conference on Virtual Reality and 3D User Interfaces

Scene-aware Far-field Automatic Speech Recognition

Zhenyu Tang, Dinesh Manocha

A novel method for generating scene-aware training data for far-field automatic speech recognition, using a deep learning-based estimator to non-intrusively compute the sub-band reverberation time of an environment from its speech samples.

arXiv.org

Redirection using Alignment

Niall Williams, Aniket Bera, Dinesh Manocha

The authors provide a generalized definition of alignment that allows it to be used in any research problem. They present an example of how alignment can be used to yield significant improvements in VR locomotion with redirected walking.

2021 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW)

Multimodal and Context-Aware Emotion Perception Model with Multiplicative Fusion

Trisha Mittal, Aniket Bera, Dinesh Manocha

A learning model for multimodal context-aware emotion recognition that combines multiple co-occurring human modalities (such as face, audio, text, and pose/gait) and two interpretations of context.

IEEE Multimedia

TS-RIR: Translated synthetic room impulse responses for speech augmentation

Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha

A method for improving the quality of synthetic room impulse responses for far-field speech recognition. The authors bridge the gap between the fidelity of synthetic room impulse responses (RIRs) and real RIRs using a novel TS-RIRGAN architecture.

arXiv.org

Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality

Trisha Mittal, Puneet Mathur, Aniket Bera, Dinesh Manocha

Affect2MM is a learning method for time-series emotion prediction for multimedia content. Its goal is to automatically capture varying emotions depicted by characters in real-life human-centric situations and behaviors. This method uses ideas from emotion causation theories to computationally model and determine the emotional state evoked in movie clips.

arXiv.org

An Overview of Enhancing Distance Learning through Augmented and Virtual Reality Technologies

Amanuel Awoke, Hugo Burbelo, Elizabeth Childs, Ferzam Mohammad, Logan Stevens, Nicholas Rewkowski, Dinesh Manocha

Distance learning presents a number of challenges. The authors identify four: the lack of social interaction, reduced student engagement and focus, reduced comprehension and information retention, and the lack of flexible and customizable instructor resources. They then examine how AR/VR technologies might address each challenge, and outline the further research that is required to fully understand the potential.

arXiv.org

Text2Gestures: A Transformer-Based Network for Generating Emotive Body Gestures for Virtual Agents

Uttaran Bhattacharya, Nicholas Rewkowski, Abhishek Banerjee, Pooja Guhan, Aniket Bera, Dinesh Manocha

Text2Gestures is a transformer-based learning method to interactively generate emotive full-body gestures for virtual agents aligned with natural language text inputs. This method generates emotionally expressive gestures by utilizing relevant biomechanical features for body expressions, also known as affective features.

arXiv.org

Example-based Real-time Clothing Synthesis for Virtual Agents

Nannan Wu, Qianwen Chao, Yanzhen Chen, Weiwei Xu, Chen Liu, Dinesh Manocha, Wenxin Sun, Yi Han, Xinran Yao, Xiaogang Jin

A real-time cloth animation method for dressing virtual humans of various shapes and poses. The approach formulates clothing deformation as a high-dimensional function of body shape parameters and pose parameters.

arXiv.org

2020

Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation

Feixiang Lu, Zongdai Liu, Hui Miao, Peng Wang, Liangjun Zhang, Ruigang Yang, Dinesh Manocha, Bin Zhou

Holistically understanding an object and its 3D movable parts through visual perception models is essential for enabling an autonomous agent to interact with the world. For autonomous driving, the dynamics and states of vehicle parts such as doors, the trunk, and the bonnet can provide meaningful semantic information and interaction states, which are essential to ensure the safety of the self-driving vehicle. Existing visual perception models mainly focus on coarse parsing such as object bounding box detection or pose estimation and rarely tackle these situations. In this paper, the authors address this important problem for autonomous driving by solving two critical issues using visual data augmentation.

arXiv.org

Self-Illusion: A Study on High-Level Cognition of Role-Playing in Immersive Virtual Environments from Non-Human Perspective

Sheng Li, Xiang Gu, Kangrui Yi, Yanlin Yang, Guoping Wang, Dinesh Manocha

This experiment investigated the occurrence of self-illusion and its contribution to realistic behavior consistent with a virtual role in virtual environments.

IEEE Transactions on Visualization and Computer Graphics

AutoTrajectory: Label-Free Trajectory Extraction and Prediction from Videos Using Dynamic Points

Yuexin Ma, Xinge Zhu, Xinjing Cheng, Ruigang Yang, Jiming Liu, Dinesh Manocha

A label-free algorithm for trajectory extraction and prediction that works directly on raw videos. To better capture the moving objects in videos, the authors introduce dynamic points to model dynamic motions, using a forward-backward extractor to maintain temporal consistency and image reconstruction to maintain spatial consistency in an unsupervised manner. The method is the first to achieve unsupervised learning of trajectory extraction and prediction.

2020 European Conference on Computer Vision

CPPM: chi-squared progressive photon mapping

Zehui Lin, Sheng Li, Xinlu Zeng, Congyi Zhang, Jinzhu Jia, Guoping Wang, Dinesh Manocha

This chi-squared progressive photon mapping algorithm (CPPM) constructs an estimator by controlling the bandwidth to obtain superior image quality.

ACM Transactions on Graphics

Sound Synthesis, Propagation, and Rendering: A Survey

Shiguang Liu, Dinesh Manocha

This is a broad overview of research on sound simulation in virtual reality, games and related applications. It first surveys sound synthesis methods, including harmonic synthesis, texture synthesis, spectral analysis, and physics-based synthesis. It then summarizes popular sound propagation techniques, namely wave-based methods, geometric methods, and hybrid methods. Next, sound rendering methods are reviewed. The authors also highlight recent methods that use machine learning techniques for synthesis, propagation, and some inverse problems.

arXiv.org

ABC-Net: Semi-Supervised Multimodal GAN-based Engagement Detection using an Affective, Behavioral and Cognitive Model

Pooja Guhan, Manas Agarwal, Naman Awasthi, Gloria Reeves, Dinesh Manocha, Aniket Bera

ABC-Net is a semi-supervised multi-modal GAN framework based on psychology literature that detects engagement levels in video conversations. It uses three constructs—behavioral, cognitive, and affective engagement—to extract various features that can effectively capture engagement levels.

arXiv.org

Generating Emotive Gaits for Virtual Agents Using Affect-Based Autoregression

Uttaran Bhattacharya, Nicholas Rewkowski, Pooja Guhan, Niall L. Williams, Trisha Mittal, Aniket Bera, Dinesh Manocha

This autoregression network generates virtual agents that convey various emotions through their walking styles or gaits.

arXiv.org

SelfDeco: Self-Supervised Monocular Depth Completion in Challenging Indoor Environments

Jaehoon Choi, Dongki Jung, Yonghan Lee, Deokhwa Kim, Dinesh Manocha, Donghwan Lee

An algorithm for self-supervised monocular depth completion for robotic navigation, computer vision and autonomous driving. The approach is based on training a neural network that requires only sparse depth measurements and corresponding monocular video sequences, without dense depth labels. The self-supervised algorithm is designed for challenging indoor environments with textureless regions, glossy and transparent surfaces, non-Lambertian surfaces, moving people, long and diverse depth ranges, and scenes captured by complex ego-motions.

arXiv.org

LCollision: Fast Generation of Collision-Free Human Poses using Learned Non-Penetration Constraints

Qingyang Tan, Zherong Pan, Dinesh Manocha

LCollision is a learning-based method that synthesizes collision-free 3D human poses. LCollision is the first approach that can obtain high accuracy in handling non-penetration and collision constraints in a learning framework.

arXiv.org

StylePredict: Machine Theory of Mind for Human Driver Behavior from Trajectories

Rohan Chandra, Aniket Bera, Dinesh Manocha

Autonomous vehicles behave conservatively in traffic environments with human drivers and do not adapt to local conditions and socio-cultural norms. However, socially aware AVs can be designed if there exists a mechanism to understand the behaviors of human drivers. In this example of Machine Theory of Mind (M-ToM), the authors infer the behaviors of human drivers by observing the trajectories of their vehicles. StylePredict is based on trajectory analysis of vehicles. It mimics human ToM to infer driver behaviors, or styles, using a computational mapping between the extracted trajectory of a vehicle in traffic and the driver's behavior, relying on graph-theoretic techniques including spectral analysis and centrality functions. StylePredict can analyze driver behavior in the USA, China, India, and Singapore, based on traffic density, heterogeneity, and conformity to traffic rules.

arXiv.org

B-GAP: Behavior-Guided Action Prediction for Autonomous Navigation

Angelos Mavrogiannis, Rohan Chandra, Dinesh Manocha

A learning algorithm for action prediction and local navigation for autonomous driving that classifies the driver behavior of other vehicles or road-agents (aggressive or conservative) and takes that into account for decision making and safe driving.

arXiv.org

BoMuDA: Boundless Multi-Source Domain Adaptive Segmentation in Unconstrained Environments

Divya Kothandaraman, Rohan Chandra, Dinesh Manocha

An unsupervised multi-source domain adaptive semantic segmentation approach for autonomous vehicles in unstructured and unconstrained traffic environments.

arXiv.org

CubeP Crowds: Crowd Simulation Integrated into “Physiology-Psychology-Physics” Factors

Mingliang Xu, Chaochao Li, Pei Lv, Wei Chen, Zhigang Deng, Bing Zhou, Dinesh Manocha

CubeP is a model for crowd simulation that comprehensively considers physiological, psychological, and physical factors. Inspired by the theory of “the devoted actor,” the model determines the movement of each individual by modeling the influence of physical strength and emotion. This is the first time that physiological, psychological, and physical factors have been integrated in a unified manner, with the relationship between the factors explicitly determined. The model generates effects similar to real-world scenarios and can reliably predict changes in the physical strength and emotion of individuals in an emergency situation.

arXiv.org

Deep-Modal: Real-Time Impact Sound Synthesis for Arbitrary Shapes

Xutong Jin, Sheng Li, Tianshu Qu, Dinesh Manocha, Guoping Wang

Modal sound synthesis is a physically based sound synthesis method used to generate audio content in games and virtual worlds. This paper presents a novel learning-based impact sound synthesis algorithm called Deep-Modal. The approach can handle sound synthesis for common arbitrary objects, especially dynamically generated objects, in real time.

MM '20: Proceedings of the 28th ACM International Conference on Multimedia

Learning Acoustic Scattering Fields for Highly Dynamic Interactive Sound Propagation

Zhenyu Tang, Hsien-Yu Meng, Dinesh Manocha

A novel hybrid sound propagation algorithm for interactive applications.

arXiv.org

Multi-Window Data Augmentation Approach for Speech Emotion Recognition

Sarala Padi, Dinesh Manocha, Ram Sriram

A novel Multi-Window Data Augmentation (MWA-SER) approach for speech emotion recognition. MWA-SER is a unimodal approach that focuses on two key concepts: designing the speech augmentation method to generate additional data samples, and building deep learning models to recognize the underlying emotion of an audio signal.

arXiv.org

IR-GAN: Room Impulse Response Generator for Speech Augmentation

Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha

The paper presents a Generative Adversarial Network (GAN) based room impulse response generator for generating realistic synthetic room impulse responses.

arXiv.org

P-Cloth: Interactive Complex Cloth Simulation on Multi-GPU Systems using Dynamic Matrix Assembly and Pipelined Implicit Integrators

Cheng Li, Min Tang, Ruofeng Tong, Ming Cai, Jieyi Zhao, Dinesh Manocha

Cloth simulation is an active area of research in computer graphics, computer-aided design (CAD) and the fashion industry. Over the last few decades many methods have been proposed for solving the underlying dynamical system with robust collision handling. The paper presents a novel parallel algorithm for cloth simulation that exploits multiple GPUs for fast computation and the handling of very high resolution meshes. It is the first approach that can perform almost interactive complex cloth simulation with wrinkles, friction and folds on commodity workstations.

arXiv.org

PerMO: Perceiving More at Once from a Single Image for Autonomous Driving

Feixiang Lu, Zongdai Liu, Xibin Song, Dingfu Zhou, Wei Li, Hui Miao, Miao Liao, Liangjun Zhang, Bin Zhou, Ruigang Yang, Dinesh Manocha

The paper presents a robust and effective approach to reconstruct complete 3D poses and shapes of vehicles from a single image. It introduces a novel part-level representation for vehicle segmentation and 3D reconstruction, which significantly improves performance.

arXiv.org

SPA: Verbal Interactions between Agents and Avatars in Shared Virtual Environments using Propositional Planning

Andrew Best, Sahil Narang, Dinesh Manocha

Sense-Plan-Act (SPA) is a new approach for generating plausible verbal interactions between virtual human-like agents and user avatars in shared virtual environments. It extends prior work in propositional planning and natural language processing to enable agents to plan with uncertain information, and to leverage question-and-answer dialogue with other agents and avatars to obtain the needed information and complete their goals. The agents are additionally able to respond to questions from the avatars and other agents using natural language, enabling real-time multi-agent, multi-avatar communication environments.

arXiv.org

RoadTrack: Realtime Tracking of Road Agents in Dense and Heterogeneous Environments

Rohan Chandra, Uttaran Bhattacharya, Tanmay Randhavane, Aniket Bera, Dinesh Manocha

RoadTrack is a realtime tracking algorithm for autonomous driving that tracks heterogeneous road-agents in dense traffic videos. The approach is designed for dense traffic scenarios that consist of different road-agents such as pedestrians, two-wheelers, cars, buses, etc. sharing the road.

GAMMA website

DGaze: CNN-Based Gaze Prediction in Dynamic Scenes

Zhiming Hu, Sheng Li, Congyi Zhang, Kangrui Yi, Guoping Wang, Dinesh Manocha

DGaze is a CNN-based model that combines object position sequences, head velocity sequences, and saliency features to predict users' gaze positions in HMD-based applications. The model can be applied to predict not only real-time gaze positions but also gaze positions in the near future, and achieves better performance than prior methods.

IEEE Transactions on Visualization and Computer Graphics

RANDM: Random Access Depth Map Compression Using Range-Partitioning and Global Dictionary

Srihari Pratapa, Dinesh Manocha

RANDM is a random-access depth map compression algorithm for interactive rendering. The compressed representation provides random access to the depth values and enables real-time parallel decompression on commodity hardware. This method partitions the depth range captured in a given scene into equal-sized intervals and uses this partition to generate three separate components that exhibit higher coherence. Each of these components is processed independently to generate the compressed stream.

GAMMA website

CMetric: A Driving Behavior Measure Using Centrality Functions

Rohan Chandra, Uttaran Bhattacharya, Trisha Mittal, Aniket Bera, Dinesh Manocha

CMetric classifies driver behaviors using centrality functions. The formulation combines concepts from computational graph theory and social traffic psychology to quantify and classify the behavior of human drivers. CMetric is used to compute the probability of a vehicle executing a driving style, as well as the intensity used to execute the style. This approach is designed for real-time autonomous driving applications, where the trajectory of each vehicle or road-agent is extracted from a video.
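
A hedged sketch of the general idea (not the paper's exact formulation): build a proximity graph over road-agents at each time step and read off centrality values, whose evolution over time can be used to characterize driving style. The radius threshold below is an arbitrary assumption.

```python
import itertools
import networkx as nx

def traffic_centrality(positions, radius=10.0):
    """positions: {agent_id: (x, y)} for a single video frame."""
    G = nx.Graph()
    G.add_nodes_from(positions)
    for (u, pu), (v, pv) in itertools.combinations(positions.items(), 2):
        if (pu[0] - pv[0]) ** 2 + (pu[1] - pv[1]) ** 2 <= radius ** 2:
            G.add_edge(u, v)        # agents within `radius` meters are considered interacting
    # Two standard centrality functions; a CMetric-style method tracks these over time.
    return nx.degree_centrality(G), nx.closeness_centrality(G)
```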

arXiv.org

Emotions Don’t Lie: A Deepfake Detection Method using Audio-Visual Affective Cues

Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

The paper presents a learning-based method for detecting fake videos. The authors use the similarity between audio-visual modalities and the similarity between the affective cues of the two modalities to infer whether a video is “real” or “fake.”

arXiv.org

EmotiCon: Context-Aware Multimodal Emotion Recognition using Frege’s Principle

Trisha Mittal, Pooja Guhan, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

EmotiCon is a learning-based algorithm for context-aware perceived human emotion recognition from videos and images. It uses multiple modalities of faces and gaits, background visual information and socio-dynamic inter-agent interactions to infer the perceived emotion. EmotiCon outperforms prior context-aware emotion recognition methods.

arXiv.org

MCQA: Multimodal Co-attention Based Network for Question Answering

Abhishek Kumar, Trisha Mittal, Dinesh Manocha

MCQA is a learning-based algorithm for multimodal question answering that explicitly fuses and aligns the multi-modal input (i.e. text, audio, and video) forming the context for the query (question and answer).

arXiv.org

Scene-aware Sound Rendering in Virtual and Real Worlds

Zhenyu Tang, Dinesh Manocha

Modern computer graphics applications including virtual reality and augmented reality have adopted techniques for both visual rendering and audio rendering. While visual rendering can already synthesize virtual objects into the real world seamlessly, it remains difficult to correctly blend virtual sound with real-world sound using state-of-the-art audio rendering. When the virtual sound is generated unaware of the scene, the corresponding application becomes less immersive, especially for AR. The authors present their current work on generating scene-aware sound using ray-tracing based simulation combined with deep learning and optimization.

2020 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops

Interactive Geometric Sound Propagation and Rendering

Micah Taylor, Anish Chandak, Lakulish Antani, Dinesh Manocha

An algorithm and system for sound propagation and rendering in virtual environments and media applications. The approach uses geometric propagation techniques for fast computation of propagation paths from a source to a listener and takes into account specular reflections, diffuse reflections, and edge diffraction.

Intel

Reactive Navigation Under Non-Parametric Uncertainty Through Hilbert Space Embedding of Probabilistic Velocity Obstacles

SriSai Naga Jyotish Poonganam, Bharath Gopalakrishnan, Venkata Seetharama Sai Bhargav Kumar Avula, K. Madhava Krishna, Arun Kumar Singh, Dinesh Manocha

A new model predictive control framework that improves reactive navigation for autonomous robots. The framework allows roboticists to compute low cost control inputs while ensuring some upper bound on the risk of collision.

IEEE Robotics and Automation Letters

RoadTrack: Realtime Tracking of Road Agents in Dense and Heterogeneous Environments

Dinesh Manocha, Rohan Chandra, Uttaran Bhattacharya, Aniket Bera, Tanmay Randhavane

The authors' RoadTrack algorithm could help autonomous vehicles navigate dense traffic scenarios. The algorithm uses a tracking-by-detection approach to detect vehicles and pedestrians, then predicts where they are going.

IROS 2019

2019

The Liar’s Walk: Detecting Deception with Gait and Gesture

Kurt Gray, Tanmay Randhavane, Kyra Kapsaskis, Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha

A data-driven deep neural algorithm for detecting deceptive walking behavior using nonverbal cues like gaits and gestures.

arXiv.org

Forecasting Trajectory and Behavior of Road-Agents Using Spectral Clustering in Graph-LSTMs

Rohan Chandra, Tianrui Guan, Srujan Panuganti, Trisha Mittal, Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha

A novel approach for traffic forecasting in urban traffic scenarios using a combination of spectral graph analysis and deep learning.

arXiv.org

NeoNav: Improving the Generalization of Visual Navigation via Generating Next Expected Observations

Qiaoyun Wu, Dinesh Manocha, Jun Wang, Kai Xu

The authors improve the cross-target and cross-scene generalization of visual navigation through a learning agent guided by conceiving the next observations it expects to see. A variational Bayesian model, NeoNav, generates the next expected observations (NEO) conditioned on the current observations of the agent and the target view.

PDF

Take an Emotion Walk: Perceiving Emotions from Gaits Using Hierarchical Attention Pooling and Affective Mapping

Uttaran Bhattacharya, Christian Roncal, Trisha Mittal, Rohan Chandra, Aniket Bera, Dinesh Manocha

The paper presents an autoencoder-based semi-supervised approach to classify perceived human emotions from walking styles obtained from videos or from motion-captured data and represented as sequences of 3D poses.

arXiv.org

Personality-Aware Probabilistic Map for Trajectory Prediction of Pedestrians

Chaochao Li, Pei Lv, Mingliang Xu, Xinyu Wang, Dinesh Manocha, Bing Zhou, Meng Wang

In many applications such as human-robot interaction, autonomous driving or surveillance, it is important to accurately predict pedestrian trajectories for collision-free navigation or abnormal behavior detection. The authors present a novel trajectory prediction algorithm for pedestrians based on a personality-aware probabilistic feature map.

arXiv.org

STEP: Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits

Uttaran Bhattacharya, Trisha Mittal, Rohan Chandra, Tanmay Randhavane (UNC), Aniket Bera, and Dinesh Manocha

STEP is a novel classifier network able to classify perceived human emotion from gaits, based on a Spatial Temporal Graph Convolutional Network architecture. Given an RGB video of an individual walking, STEP implicitly exploits the gait features to classify the emotional state of the human into one of four emotions: happy, sad, angry, or neutral. | Watch a video about STEP |

arXiv.org

GraphRQI: Classifying Driver Behaviors Using Graph Spectrums

Rohan Chandra, Uttaran Bhattacharya, Trisha Mittal, Xiaoyu Li, Aniket Bera, Dinesh Manocha

The GraphRQI algorithm identifies driver behaviors from road-agent trajectories. It is 25 percent more accurate than prior behavior classification algorithms for autonomous vehicles. | Watch a video about GraphRQI |

arXiv.org

Realtime Simulation of Thin-Shell Deformable Materials using CNN-Based Mesh Embedding

Qingyang Tan, Zherong Pan, Lin Gao, and Dinesh Manocha

A new method bridges the gap between mesh embedding and physical simulation for efficient dynamic models of clothes. The key techniques are a graph-based convolutional neural network (CNN) defined on meshes with arbitrary topologies and a new mesh embedding approach based on a physics-inspired loss term. After training, the learned simulator runs 10–100 times faster, and its accuracy is high enough for robot manipulation tasks. | Watch a video about this method |

arXiv.org

2020

Using Data Partitions and Stateless Servers to Scale Up Fedora Repositories

Gregory Jansen, Aaron Coburn, Adam Soroka, Richard Marciano

Describes the development and testing of the next-generation Trellis Linked Data Platform with Memento versioning support.

dcicblog.umd.edu

2019

Computational thinking in archival science research and education

William Underwood and Richard Marciano

This paper explores whether the computational thinking practices of mathematicians and scientists in the physical and biological sciences are also the practices of archival scientists. It is argued that these practices are essential elements of an archival science education in preparing students for a professional archival career.

Reframing Digital Curation Practices through a Computational Thinking Framework

Marciano and 20 students in the Digital Curation Innovation Center developed a reframing model for digital curation through computational thinking. Their case study involves adding metadata to non-digital primary records from the WWII Tule Lake Japanese American Internment Camp. Their curation methods led to the discovery of new narratives and connections from this data.

2021

Pairwise Comparison Evolutionary Dynamics with Strategy-Dependent Revision Rates: Stability and δ-Passivity (Expanded Version)

Semih Kara, Nuno Martins

Investigates methods to characterize the stability of a continuous-time dynamical system that models the dynamics of non-cooperative strategic interactions among the members of large populations of bounded-rationality agents.

arXiv.org

2020

Dissipativity Tools for Convergence to Nash Equilibria in Population Games

Murat Arcak, Nuno Martins

Presents dissipativity tools to establish global asymptotic stability of the set of Nash equilibria in a deterministic model of population games.

arXiv.org

2021

A Demonstration of Refinement Acting, Planning and Learning System Using Operational Models

Sunandita Patra, James Mason, Malik Ghallab, Paolo Traverso, Dana Nau

The authors demonstrate a system with integrated acting, planning and learning algorithms that uses hierarchical operational models to perform tasks in dynamically changing environments. In this acting and planning engine, both planning and acting use the same operational models. These rely on hierarchical task-oriented refinement methods offering rich control structures.

Demonstration at the Thirty-First International Conference on Automated Planning and Scheduling (ICAPS 2021)

Integrating Planning and Acting With a Re-Entrant HTN Planner

Yash Bansod, Dana Nau, Sunandita Patra, Mark Roberts

A major problem with integrating HTN planning and acting is that, unless the HTN methods are very carefully written, unexpected problems can occur when attempting to replan if execution errors or other unexpected conditions occur during acting. To overcome this problem, we present a re-entrant HTN planning algorithm that can be restarted for replanning purposes at the point where an execution error occurred, and an HTN acting algorithm that can restart the HTN planner at this point. We show through experiments that our algorithm is an improvement over a widely used approach to planning and control.

Proceedings of the 4th ICAPS Workshop on Hierarchical Planning (HPlan 2021)

GTPyhop: A Hierarchical Goal+Task Planner Implemented in Python

Dana Nau, Yash Bansod, Sunandita Patra, Mark Roberts, Ruoxi Li

The Pyhop planner, released in 2013, was a simple SHOP-style planner written in Python. It was designed to be easily usable as an embedded system in conventional applications such as game programs. Although little effort was made to publicize Pyhop, its simplicity, ease of use, and understandability led to its use in a number of projects beyond its original intent, and to publications by others.

GTPyhop (Goal-and-Task Pyhop) is an extended version of Pyhop that can plan for both goals and tasks, using a combination of SHOP-style task decomposition and GDP-style goal decomposition. It provides a totally-ordered version of Goal-Task-Network (GTN) planning without sharing and task insertion. GTPyhop’s ability to represent and reason about both goals and tasks provides a high degree of flexibility for representing objectives in whichever form seems more natural to the domain designer.
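
The sketch below is a minimal SHOP-style task-decomposition planner in plain Python, included only to illustrate the planning style that Pyhop and GTPyhop build on; it does not use GTPyhop's actual API, and all names in it are hypothetical.

```python
def a_walk(state, person, dest):                    # action: returns the successor state
    state = dict(state)
    state[person] = dest
    return state

def m_already_there(state, person, dest):           # method: returns a subtask list or None
    return [] if state.get(person) == dest else None

def m_walk_there(state, person, dest):
    return [("a_walk", person, dest)] if state.get(person) != dest else None

ACTIONS = {"a_walk": a_walk}
METHODS = {"travel": [m_already_there, m_walk_there]}

def plan(state, tasks):
    """Depth-first HTN refinement: actions are applied in simulation; tasks are
    decomposed by the first applicable method."""
    if not tasks:
        return []
    head, *rest = tasks
    name, *args = head
    if name in ACTIONS:
        tail = plan(ACTIONS[name](state, *args), rest)
        return None if tail is None else [head] + tail
    for method in METHODS.get(name, []):
        subtasks = method(state, *args)
        if subtasks is not None:
            result = plan(state, subtasks + rest)
            if result is not None:
                return result
    return None

print(plan({"alice": "home"}, [("travel", "alice", "park")]))  # -> [('a_walk', 'alice', 'park')]
```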

Proceedings of the 4th ICAPS Workshop on Hierarchical Planning (HPlan 2021)

Decentralized Refinement Planning and Acting

Ruoxi Li, Sunandita Patra, Dana Nau

The authors describe Dec-RPAE, a system for decentralized multi-agent acting and planning in partially observable and non-deterministic environments. The system includes both an acting component and an online planning component.

Proceedings of the Thirty-First International Conference on Automated Planning and Scheduling (ICAPS 2021)

Deliberative Acting, Planning and Learning with Hierarchical Operational Models

Sunandita Patra, James Mason, Malik Ghallab, Dana Nau, Paolo Traverso

The authors define and implement an integrated acting-and-planning system in which both planning and acting use the same operational models.

Preprint submitted to Artificial Intelligence

Approximating Spatial Evolutionary Games using Bayesian Networks

Vincent Hsiao, Xinyue Pan, Dana Nau, Rina Dechter

The authors define a framework for modeling spatial evolutionary games using Dynamic Bayesian Networks that capture the underlying stochastic process. The resulting Dynamic Bayesian Networks can be queried for quantities of interest by performing exact inference on the network. They then propose a method for producing approximations of the spatial evolutionary game through the truncation of the corresponding DBN, taking advantage of the high symmetry of the model.

Proceedings of the ACM 20th International Conference on Autonomous Agents and MultiAgent Systems

The relationship between cultural tightness–looseness and COVID-19 cases and deaths: a global analysis

Michele Gelfand, Joshua Jackson, Xinyue Pan, Dana Nau, Dylan Pieper, Emmy Denison, Munqith Dagher, Paul Van Lange, Chi-Yue Chiu, Mo Wang

The COVID-19 pandemic is a global health crisis, yet certain countries have had far more success in limiting COVID-19 cases and deaths. The authors suggest that collective threats require a tremendous amount of coordination, and that strict adherence to social norms is a key mechanism that enables groups to do so. The paper examines how the strength of social norms—or cultural tightness–looseness—was associated with countries' success in limiting cases and deaths. The results indicated that, compared with nations with high levels of cultural tightness, nations with high levels of cultural looseness are estimated to have had 4.99 times the number of cases (7132 per million vs 1428 per million, respectively) and 8.71 times the number of deaths (183 per million vs 21 per million, respectively), taking into account a number of controls. A formal evolutionary game theoretic model suggested that tight groups coordinate much faster and have higher survival rates than loose groups. The results suggest that tightening social norms might confer an evolutionary advantage in times of collective threat.

The Lancet Planetary Health

2020

Decentralized Acting and Planning Using Hierarchical Operational Models

Ruoxi Li, Sunandita Patra, Dana Nau

The paper describes Dec-RAE-UPOM, a system for decentralized multi-agent acting and planning in environments that are partially observable, nondeterministic, and dynamically changing.

Dr. Nau's Computer Science paper archives

Integrating acting, planning, and learning in hierarchical operational models

Sunandita Patra, Amit Kumar, James Mason, Malik Ghallab, Paolo Traverso, Dana Nau

New planning and learning algorithms for the Refinement Acting Engine (RAE), which uses hierarchical operational models to perform tasks in dynamically changing environments.

2020 International Conference on Automated Planning and Scheduling (ICAPS)

2019

APE: Acting and Planning Engine

Sunandita Patra, Malik Ghallab, Dana Nau, Paolo Traverso

An integrated acting and planning system that addresses the consistency problem by using the actor’s operational models both for acting and for planning.

2020

Metamorphic filtering of black-box adversarial attacks on multi-network face recognition models

Rohan Mekala, Adam Porter, Mikael Lindvall

The authors build a black box attack against robust multi-model face recognition pipelines and test it against Google’s FaceNet. They present a novel metamorphic defense pipeline relying on nonlinear image transformations to detect adversarial attacks with a high degree of accuracy. They further use the results to create probabilistic metamorphic relations that define efficient decision boundaries between safe and adversarial examples.

ICSEW'20, May 2020, Seoul, South Korea
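
As a rough sketch of the metamorphic idea, benign nonlinear transformations are applied to an input, and the consistency of the model's response under those transformations serves as the detection signal. The gamma transform, the embed callable, and the threshold below are illustrative assumptions, not the pipeline from the paper.

```python
import numpy as np

def gamma_adjust(img, gamma):
    """One example of a benign nonlinear image transformation."""
    return np.clip(img, 0.0, 1.0) ** gamma

def looks_adversarial(embed, img, gammas=(0.8, 1.25), threshold=0.5):
    """Flag an image whose embedding shifts sharply under benign transforms.

    `embed` is a hypothetical stand-in for a face-recognition embedding model
    (e.g., a FaceNet-style network); `threshold` would be calibrated on clean data.
    """
    base = embed(img)
    shifts = [np.linalg.norm(embed(gamma_adjust(img, g)) - base) for g in gammas]
    return max(shifts) > threshold
```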

2020

A Machine Learning based Approximate Computing Approach on Data Flow Graphs: Work-in-Progress

Ye Wang, Jian Dong, Yanxin Liu, Chunpei Wang, Gang Qu

A report on ongoing work towards a machine learning based runtime approximate computing (AC) approach that can be applied to the data flow graph representation of any software program. The approach can utilize runtime inputs together with prior information about the software to identify and approximate the noncritical portion of a computation with low runtime overhead. Some preliminary experimental results show that, compared with previous runtime AC approaches, this approach can significantly reduce the time overhead with little loss in energy efficiency and computation accuracy.

2020 International Conference on Embedded Software (EMSOFT)
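
As a toy illustration of runtime approximation on a data flow graph, the sketch below swaps in a cheaper approximation for a node that a (here trivial) criticality check marks as non-critical for the current input. The graph, the criticality rule, and the approximation are illustrative stand-ins, not the machine-learning approach from the paper.

```python
import math

def exact_exp(x):
    return math.exp(x)

def approx_exp(x):
    return 1.0 + x + 0.5 * x * x   # cheap second-order approximation of exp

def run(x, allow_approx=True):
    # A two-node data flow graph: scale -> exp.
    y = 2.0 * x                               # critical node: always computed exactly
    critical = abs(y) > 0.5                   # trivial stand-in for a learned criticality predictor
    f = exact_exp if (critical or not allow_approx) else approx_exp
    return f(y)

print(run(0.1), run(0.1, allow_approx=False))  # approximate vs exact result
```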

A New Aging Sensor for the Detection of Recycled ICs

Zhichao Xu, Aijiao Cui, Gang Qu

The electronics industry has become the main target of counterfeiting, and integrated circuits (ICs) are highly vulnerable to various forms of it, such as recycling. Recycled ICs do not have the performance and service lifetime of genuine ones, which poses a threat to the reliability of electronic systems. This paper proposes a novel recycled-IC detection method: an authentication mechanism and a parallel circuit unit structure, serving as an aging sensor, are used to distinguish recycled ICs from fresh ICs.

GLSVLSI '20: Proceedings of the 2020 Great Lakes Symposium on VLSI

Is It Approximate Computing or Malicious Computing?

Ye Wang, Jian Dong, Qian Xu, Zhaojun Lu, Gang Qu

Approximate computing (AC) is an attractive energy-efficient technique that can be implemented at almost all design levels, including data, algorithm, and hardware. The basic idea behind AC is to deliberately control the trade-off between computation accuracy and energy efficiency. However, the introduction of AC exposes traditional computing frameworks to many potential security vulnerabilities. This paper analyzes these vulnerabilities and the associated attacks, as well as corresponding countermeasures.

GLSVLSI '20: Proceedings of the 2020 Great Lakes Symposium on VLSI

Privacy Threats and Protection in Machine Learning

Jiliang Zhang, Chen Li, Jing Ye, Gang Qu

This article reviews recent research progress on machine learning privacy. First, the privacy threats on data and models in different scenarios are described in detail. Then, typical privacy protection methods are introduced. Finally, the limitations and future development trends of ML privacy research are discussed.

GLSVLSI '20: Proceedings of the 2020 Great Lakes Symposium on VLSI

2019

Research on the impact of different benchmark circuits on the representative path in FPGAs

Jiqing Xu, Zhengjie Li, Yunbing Pang, Jian Wang, Gang Qu, Jinmei Lai

Provided that a large number of typical benchmark circuits is selected, a representative path delay can accurately represent the overall timing performance of an FPGA.

2019 IEEE 13th International Conference on ASIC (ASICON)

2021

Human-Centered AI: A New Synthesis

Ben Shneiderman

Researchers, developers, business leaders, policy makers and others are expanding the technology-centered scope of Artificial Intelligence (AI) to include Human-Centered AI (HCAI) ways of thinking. This expansion from an algorithm-focused view to a human-centered perspective can shape the future of technology so that it better serves human needs. Educators, designers, software engineers, product managers, evaluators, and government agency staffers can build on AI-driven technologies to design products and services that make life better for users. These human-centered products and services will enable people to better care for each other, build sustainable communities, and restore the environment.

INTERACT 2021, the IFIP Conference on Human-Computer Interaction

Artificial Intelligence for Humankind: A Panel on How to Create Truly Interactive and Human-Centered AI for the Benefit of Individuals and Society

Albrecht Schmidt, Fosca Giannotti, Wendy Mackay, Ben Shneiderman, Kaisa Väänänen

This panel discusses the role of human-computer interaction (HCI) in the conception, design, and implementation of human-centered artificial intelligence (AI). For us, it is important that AI and machine learning (ML) are ethical and create value for humans, both as individuals and for society. Our discussion emphasizes the opportunities of using HCI and User Experience Design methods to create advanced AI/ML-based systems that will be widely adopted, reliable, safe, trustworthy, and responsible. The resulting systems will integrate AI and ML algorithms while providing user interfaces and control panels that ensure meaningful human control.

INTERACT 2021, the IFIP Conference on Human-Computer Interaction

Designing AI to Work WITH or FOR People?

Dakuo Wang, Pattie Maes, Xiangshi Ren, Ben Shneiderman, Yuanchun Shi, Qianying Wang

Artificial Intelligence (AI) can refer both to machine learning algorithms and to the automation applications built on top of them. Human-computer interaction (HCI) researchers have studied these AI applications and suggested various Human-Centered AI (HCAI) principles for an explainable, safe, reliable, and trustworthy interaction experience. While some designers believe that computers should be supertools and active appliances, others believe that the latest AI systems can be collaborators. We ask: does the supertool metaphor or the collaboration metaphor best support work and play? How can we design AI systems to work best with people or for people? What does it take to get there?

2021 ACM CHI Conference on Human Factors in Computing Systems

Tutorial on Human-Centered AI: Reliable, Safe and Trustworthy

Ben Shneiderman

Shneiderman's tutorial proposes a new synthesis, in which Artificial Intelligence (AI) algorithms are combined with human-centered thinking to make Human-Centered AI (HCAI). This approach combines research on AI algorithms with user experience design methods to shape technologies that amplify, augment, empower, and enhance human performance. Researchers and developers for HCAI systems value meaningful human control, putting people first by serving human needs, values, and goals.

26th ACM International Conference on Intelligent User Interfaces

Universal Usability: A Grand Challenge for HCI

Ben Shneiderman

This is a 2006 draft position paper for the emerging science of the web. Readers will gain hope for the future by reading how government services and digital libraries are being redesigned to make them more usable for diverse users. The paper gives a good taste of the breadth of research being done, covering not only the diversity of users and their special needs but also the research methods and outcomes. The breadth of these implications highlights why universal usability research is so important. There is progress and hope, but there are many minds to be changed and much work to be done.

docsbay.net

2020

Bridging the Gap Between Ethics and Practice: Guidelines for Reliable, Safe, and Trustworthy Human-centered AI Systems

Ben Shneiderman

This paper bridges the gap between widely discussed ethical principles of Human-Centered AI (HCAI) and practical steps for effective governance.

ACM Transactions on Interactive Intelligent Systems

Human-Centered Artificial Intelligence: Three Fresh Ideas

Ben Shneiderman

A commentary that reverses the current emphasis on algorithms and AI methods by putting humans at the center of systems design thinking. It offers three ideas: (1) a two-dimensional HCAI framework, which shows how it is possible to have both high levels of human control AND high levels of automation; (2) a shift from emulating humans to empowering people, with a plea to move language, imagery, and metaphors away from portrayals of intelligent autonomous teammates and towards descriptions of powerful tool-like appliances and tele-operated devices; and (3) a three-level governance structure that describes how software engineering teams can develop more reliable systems, how managers can emphasize a safety culture across an organization, and how industry-wide certification can promote trustworthy HCAI systems.

AIS Transactions on Human-Computer Interaction

Human-Centered Artificial Intelligence: Reliable, Safe & Trustworthy

Ben Shneiderman

Proposes Human-Centered Artificial Intelligence, a two-dimensional framework offered as an alternative to autonomous AI systems. The framework clarifies how to design for high levels of human control and high levels of computer automation to increase human performance, identifies the situations in which full human control or full computer control is necessary, and helps avoid the dangers of either excessive human control or excessive computer control.

International Journal of Human-Computer Interaction

2020

Coded Distributed Computing with Partial Recovery

Emre Ozfatura, Sennur Ulukus, Deniz Gunduz

Introduces a novel coded matrix-vector multiplication scheme, called coded computation with partial recovery (CCPR), which benefits from the advantages of both coded and uncoded computation schemes, and reduces both computation time and decoding complexity by allowing a trade-off between the accuracy and the speed of computation. The approach is extended to distributed implementation of more general computation tasks by proposing a coded communication scheme with partial recovery, where the results of subtasks computed by the workers are coded before being communicated.

2019 IEEE International Conference on Acoustics, Speech and Signal Processing; arXiv.org
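
A toy NumPy sketch of the general idea behind coded matrix-vector multiplication with partial recovery follows; the block sizes, the simple sum code, and the straggler scenario are illustrative, not the CCPR construction itself.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))            # data matrix, split into two row blocks
x = rng.standard_normal(3)

A1, A2 = A[:2], A[2:]
tasks = {"w1": A1, "w2": A2, "w3": A1 + A2}   # two uncoded blocks plus one coded (sum) block

# Suppose w2 straggles and only w1 and w3 return their results.
results = {w: tasks[w] @ x for w in ("w1", "w3")}

# Full recovery: A2 @ x = (A1 + A2) @ x - A1 @ x
y = np.concatenate([results["w1"], results["w3"] - results["w1"]])
assert np.allclose(y, A @ x)

# Partial recovery: had only w1 responded, the first half of A @ x would
# already be available, trading coverage of the result for speed.
```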

Age-Based Coded Computation for Bias Reduction in Distributed Learning

Deniz Gunduz, Emre Ozfatura, Sennur Ulukus, Baturalp Buyukates

The age of information (AoI) metric is used to track how frequently partial computations are recovered in distributed gradient descent, the most common approach in supervised machine learning, providing a new solution to the problem of “straggling” worker machines.

arXiv.org
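
For reference, the age of information at the master is conventionally the time elapsed since the generation of the most recently received update; in its standard form (this is the generic AoI definition, not a formula from the paper):

```latex
\Delta(t) = t - u(t)
```

where u(t) is the generation time of the most recently received update.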

Channel-Aware Adversarial Attacks against Deep Learning-based Wireless Signal Classifiers

Brian Kim, Yalin E. Sagduyu, Kemal Davaslioglu, Tugba Erpek, Sennur Ulukus

Presents over-the-air adversarial attacks against deep learning-based modulation classifiers, accounting for realistic channel and broadcast transmission effects. A certified defense method using randomized smoothing is also included.

arXiv.org
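
Randomized smoothing, the certified defense mentioned above, classifies by majority vote over noise-perturbed copies of the input. The sketch below shows only the prediction step; the classify callable, noise level, and sample count are assumptions, and the certification radius computed from the vote statistics in the literature is omitted.

```python
import numpy as np

def smoothed_predict(classify, x, sigma=0.25, n_samples=1000, seed=0):
    """Majority vote of `classify` over Gaussian-perturbed copies of x.

    `classify` is a hypothetical stand-in for the trained signal classifier,
    mapping a single input array to an integer class label.
    """
    rng = np.random.default_rng(seed)
    noise = sigma * rng.standard_normal((n_samples,) + np.shape(x))
    votes = np.array([classify(np.asarray(x) + n) for n in noise])
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]
```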


