Articles from 2019-12-24

45 new data science research articles were published on 2019-12-24. 23 discussed machine learning.

Bryan Whiting https://www.bryanwhiting.com
2019-12-27


Breakdown of arXiv Publication Counts

Yesterday’s counts of submitted papers on www.arxiv.org grouped by primary subject. Click the links in the table to be re-directed to the abstracts below. The links under Subject will redirect you to abstracts with the primary subject (there can only be one primary subject on arXiv). The links under Category will redirect you to all publications yesterday with a given tag (primary or secondary).

Table 1: Number of articles by subject and primary category. Colored titles are hyperlinks that take you to the abstracts below. Key - Subject: Computer Science (5) means there were 5 articles with primary tag CS. Category: Machine Learning (cs.LG) N = 8 (16) means 8 articles had (cs.LG) as their primary tag and 16 more had it as a secondary tag, for 24 in total. Click this link to be taken to all 24. Only select categories are highlighted because they are of particular interest to applied data scientists.
Subject | Category | N
Computer Science (30) | Computer Vision and Pattern Recognition (cs.CV) | 10 (3)
 | Machine Learning (cs.LG) | 9 (11)
 | Artificial Intelligence (cs.AI) | 3 (1)
 | Cryptography and Security (cs.CR) | 2 (1)
 | Neural and Evolutionary Computing (cs.NE) | 2
 | Sound (cs.SD) | 1 (1)
 | Computational Engineering, Finance, and Science (cs.CE) | 1
 | Social and Information Networks (cs.SI) | 1
 | Software Engineering (cs.SE) | 1
Statistics (4) | Methodology (stat.ME) | 2 (1)
 | Machine Learning (stat.ML) | 1 (12)
 | Applications (stat.AP) | 1 (1)
Elec. Eng. and Systems Science (2) | Image and Video Processing (eess.IV) | 1 (2)
 | Signal Processing (eess.SP) | 1
Other (2) | Astrophysics of Galaxies (astro-ph.GA) | 1
 | High Energy Astrophysical Phenomena (astro-ph.HE) | 1
Quantitative Finance (2) | Risk Management (q-fin.RM) | 1
 | Statistical Finance (q-fin.ST) | 1
Quantum Physics (2) | Quantum Physics (quant-ph) | 2
Mathematics (1) | Statistics Theory (math.ST) | 1
Physics (1) | Computational Physics (physics.comp-ph) | 1 (2)
Quantitative Biology (1) | Quantitative Methods (q-bio.QM) | 1
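
The N column in Table 1 packs two numbers into one cell: the primary count, followed by the secondary count in parentheses. As a small illustration of how to read it, the sketch below parses a cell and adds the two. The helper name `parse_count` is hypothetical and not part of the site's own tooling.

```python
import re


def parse_count(cell):
    """Parse an N cell such as '9 (11)' into (primary, secondary, total).

    A cell without parentheses, such as '2', has no secondary tags.
    """
    match = re.fullmatch(r"\s*(\d+)(?:\s*\((\d+)\))?\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized count cell: {cell!r}")
    primary = int(match.group(1))
    secondary = int(match.group(2)) if match.group(2) else 0
    return primary, secondary, primary + secondary


# Three cells copied from Table 1.
for label, cell in [("cs.LG", "9 (11)"), ("stat.ML", "1 (12)"), ("cs.NE", "2")]:
    primary, secondary, total = parse_count(cell)
    print(f"{label}: {primary} primary + {secondary} secondary = {total} total")
```

The totals for cs.LG and stat.ML computed this way match the "20 new" and "13 new" counts in the section headings further down.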

Articles for Statistics, Machine Learning, Econometrics, and Finance

This section contains all articles with any tag of stat.AP, stat.CO, stat.ML, cs.LG, q-fin.ST, q-fin.EC, or econ.EM. Only the first two sentences are shown; click the links for more detail.

Applications (stat.AP): 2 new

Applications (stat.AP)
Aggregating predictions from experts: a scoping review of statistical methods, experiments, and applications
Applications. 4 authors. pdf
Forecasts support decision making in a variety of applications. Statistical models can produce accurate forecasts given abundant training data, but when data is sparse, rapidly changing, or unavailable, statistical models may not be able to make accurate predictions. …
Expert judgmental forecasts—models that combine expert-generated predictions into a single forecast—can make predictions when training data is limited by relying on expert intuition to take the place of concrete training data. Researchers have proposed a wide array of algorithms to combine expert predictions into a single forecast, but there is no consensus on an optimal aggregation model. This scoping review surveyed recent literature on aggregating expert-elicited predictions. We gathered common terminology, aggregation methods, and forecasting performance metrics, and offer guidance to strengthen future work that is growing at an accelerated pace.
Online Quantification of Input Model Uncertainty by Two-Layer Importance Sampling
Applications, Risk Management. 2 authors. pdf
Stochastic simulation has been widely used to analyze the performance of complex stochastic systems and facilitate decision making in those systems. Stochastic simulation is driven by the input model, which is a collection of probability distributions that model the stochasticity in the system. …
The input model is usually estimated using a finite amount of data, which introduces the so-called input model uncertainty (or, input uncertainty for short) to the simulation output. How to quantify input uncertainty has been studied extensively, and many methods have been proposed for the batch data setting, i.e., when all the data are available at once. However, methods for “streaming data” arriving sequentially in time are still in demand, even though streaming data have become increasingly prevalent in modern applications. To fill in this gap, we propose a two-layer importance sampling framework that incorporates streaming data for online input uncertainty quantification. Under this framework, we develop two algorithms that suit two different application scenarios: the first is when data come at a fast speed and there is no time for any simulation in between updates; the second is when data come at a moderate speed and a few but limited simulations are allowed at each time stage. We show the consistency and asymptotic convergence rate results, which theoretically show the efficiency of our proposed approach. We further demonstrate the proposed algorithms on an example of the news vendor problem.

Machine Learning (stat.ML): 13 new

Machine Learning (stat.ML)
Fast and deep neuromorphic learning with time-to-first-spike coding
Neurons and Cognition, Neural and Evolutionary Computing, Machine Learning, Emerging Technologies. 11 authors. pdf
For a biological agent operating under environmental pressure, energy consumption and reaction times are of critical importance. Similarly, engineered systems also strive for short time-to-solution and low energy-to-solution characteristics. …
At the level of neuronal implementation, this implies achieving the desired results with as few and as early spikes as possible. In the time-to-first-spike coding framework, both of these goals are inherently emerging features of learning. Here, we describe a rigorous derivation of error-backpropagation-based learning for hierarchical networks of leaky integrate-and-fire neurons. We explicitly address two issues that are relevant for both biological plausibility and applicability to neuromorphic substrates by incorporating dynamics with finite time constants and by optimizing the backward pass with respect to substrate variability. This narrows the gap between previous models of first-spike-time learning and biological neuronal dynamics, thereby also enabling fast and energy-efficient inference on analog neuromorphic devices that inherit these dynamics from their biological archetypes, which we demonstrate on two generations of the BrainScaleS analog neuromorphic architecture.
Broad Learning System Based on Maximum Correntropy Criterion
Machine Learning, Machine Learning. 10 authors. pdf
As an effective and efficient discriminative learning method, Broad Learning System (BLS) has received increasing attention due to its outstanding performance in various regression and classification problems. However, the standard BLS is derived under the minimum mean square error (MMSE) criterion, which is, of course, not always a good choice due to its sensitivity to outliers. …
To enhance the robustness of BLS, we propose in this work to adopt the maximum correntropy criterion (MCC) to train the output weights, obtaining a correntropy based broad learning system (C-BLS). Thanks to the inherent superiorities of MCC, the proposed C-BLS is expected to achieve excellent robustness to outliers while maintaining the original performance of the standard BLS in Gaussian or noise-free environment. In addition, three alternative incremental learning algorithms, derived from a weighted regularized least-squares solution rather than pseudoinverse formula, for C-BLS are developed. With the incremental learning algorithms, the system can be updated quickly without the entire retraining process from the beginning, when some new samples arrive or the network deems to be expanded. Experiments on various regression and classification datasets are reported to demonstrate the desirable performance of the new methods.
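
To make the contrast between the mean-square-error and correntropy criteria concrete, here is a minimal sketch of output weights trained under a Gaussian-kernel correntropy objective by repeatedly solving a weighted regularized least-squares problem. It is not the paper's C-BLS implementation: the function name `mcc_ridge`, the kernel width `sigma`, the regularizer `reg`, and the fixed iteration count are all assumptions chosen for illustration, and the feature matrix `A` stands in for whatever hidden-layer features a broad learning system would produce.

```python
import numpy as np


def mcc_ridge(A, y, sigma=1.0, reg=1e-3, n_iter=30):
    """Fit linear output weights under a correntropy-style objective.

    Each pass down-weights samples with large residuals using a Gaussian
    kernel, then re-solves a weighted ridge regression (illustrative only).
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    eye = np.eye(A.shape[1])
    # Start from the ordinary (MSE-based) ridge solution.
    w = np.linalg.solve(A.T @ A + reg * eye, A.T @ y)
    for _ in range(n_iter):
        residual = y - A @ w
        # Gaussian-kernel weights: outliers get weights close to zero.
        lam = np.exp(-residual ** 2 / (2.0 * sigma ** 2))
        weighted_A = A * lam[:, None]
        w = np.linalg.solve(A.T @ weighted_A + reg * eye, weighted_A.T @ y)
    return w


# Toy data: y = 2x, with the last five targets corrupted by large outliers.
x = np.linspace(-1.0, 1.0, 50)
A = x[:, None]
y = 2.0 * x
y[-5:] += 20.0

w_mse = np.linalg.solve(A.T @ A + 1e-3 * np.eye(1), A.T @ y)
w_mcc = mcc_ridge(A, y)
print("MSE slope:", w_mse[0], "MCC slope:", w_mcc[0])
```

On this toy problem the MSE fit is pulled far from the true slope by the five corrupted targets, while the reweighted fit stays close to 2, which is the robustness property the abstract attributes to MCC.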
TRADI: Tracking deep neural network weight distributions
Machine Learning, Machine Learning, Computer Vision and Pattern Recognition. 5 authors. pdf
During training, the weights of a Deep Neural Network (DNN) are optimized from a random initialization towards a nearly optimum value minimizing a loss function. Only this final state of the weights is typically kept for testing, while the wealth of information on the geometry of the weight space, accumulated over the descent towards the minimum is discarded. …
In this work we propose to make use of this knowledge and leverage it for computing the distributions of the weights of the DNN. This can be further used for estimating the epistemic uncertainty of the DNN by sampling an ensemble of networks from these distributions. To this end we introduce a method for tracking the trajectory of the weights during optimization, that does not require any changes in the architecture nor on the training procedure. We evaluate our method on standard classification and regression benchmarks, and on out-of-distribution detection for classification and semantic segmentation. We achieve competitive results, while preserving computational efficiency in comparison to other popular approaches.
Attention-Aware Answers of the Crowd
Machine Learning, Machine Learning. 5 authors. pdf
Crowdsourcing is a relatively economic and efficient solution to collect annotations from the crowd through online platforms. Answers collected from workers with different expertise may be noisy and unreliable, and the quality of annotated data needs to be further maintained. …
Various solutions have been attempted to obtain high-quality annotations. However, they all assume that workers’ label quality is stable over time (always at the same level whenever they conduct the tasks). In practice, workers’ attention level changes over time, and the ignorance of which can affect the reliability of the annotations. In this paper, we focus on a novel and realistic crowdsourcing scenario involving attention-aware annotations. We propose a new probabilistic model that takes into account workers’ attention to estimate the label quality. Expectation propagation is adopted for efficient Bayesian inference of our model, and a generalized Expectation Maximization algorithm is derived to estimate both the ground truth of all tasks and the label-quality of each individual crowd worker with attention. In addition, the number of tasks best suited for a worker is estimated according to changes in attention. Experiments against related methods on three real-world and one semi-simulated datasets demonstrate that our method quantifies the relationship between workers’ attention and label-quality on the given tasks, and improves the aggregated labels.
Attack-Resistant Federated Learning with Residual-based Reweighting
Machine Learning, Machine Learning. 4 authors. pdf
Federated learning has a variety of applications in multiple domains by utilizing private training data stored on different devices. However, the aggregation process in federated learning is highly vulnerable to adversarial attacks so that the global model may behave abnormally under attacks. …
To tackle this challenge, we present a novel aggregation algorithm with residual-based reweighting to defend federated learning. Our aggregation algorithm combines repeated median regression with the reweighting scheme in iteratively reweighted least squares. Our experiments show that our aggregation algorithm outperforms other alternative algorithms in the presence of label-flipping, backdoor, and Gaussian noise attacks. We also provide theoretical guarantees for our aggregation algorithm.
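
The abstract names repeated median regression as one ingredient of the aggregation rule. As background, the sketch below implements only a textbook repeated median line fit, which tolerates a large fraction of outlying points. It does not reproduce the paper's reweighting scheme or its federated aggregation step. The function name `repeated_median_line` is hypothetical, the intercept uses the common simplification of taking the median of `y - slope * x`, and the inputs are assumed to have distinct `x` values.

```python
import numpy as np


def repeated_median_line(x, y):
    """Robust line fit: median over points of the median pairwise slope.

    Assumes all x values are distinct (illustrative sketch only).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    per_point_slope = np.empty(n)
    for i in range(n):
        others = np.arange(n) != i
        # Median slope of the lines joining point i to every other point.
        per_point_slope[i] = np.median((y[others] - y[i]) / (x[others] - x[i]))
    slope = np.median(per_point_slope)
    intercept = np.median(y - slope * x)
    return slope, intercept


# Eleven points on y = 3x + 1, two of which are replaced by outliers.
x = np.arange(11)
y = 3.0 * x + 1.0
y[2] = 100.0
y[7] = -50.0
print(repeated_median_line(x, y))
```

In a robust-aggregation setting, residuals from a fit like this could be used to down-weight suspicious values, but how the paper does so is not described in the abstract.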
Cronus: Robust and Heterogeneous Collaborative Learning with Black-Box Knowledge Transfer
Machine Learning, Machine Learning, Cryptography and Security. 4 authors. pdf
Collaborative (federated) learning enables multiple parties to train a model without sharing their private data, but through repeated sharing of the parameters of their local models. Despite its advantages, this approach has many known privacy and security weaknesses and performance overhead, in addition to being limited only to models with homogeneous architectures. …
Shared parameters leak a significant amount of information about the local (and supposedly private) datasets. Besides, federated learning is severely vulnerable to poisoning attacks, where some participants can adversarially influence the aggregate parameters. Large models, with high dimensional parameter vectors, are in particular highly susceptible to privacy and security attacks: curse of dimensionality in federated learning. We argue that sharing parameters is the most naive way of information exchange in collaborative learning, as they open all the internal state of the model to inference attacks, and maximize the model’s malleability by stealthy poisoning attacks. We propose Cronus, a robust collaborative machine learning framework. The simple yet effective idea behind designing Cronus is to control, unify, and significantly reduce the dimensions of the exchanged information between parties, through robust knowledge transfer between their black-box local models. We evaluate all existing federated learning algorithms against poisoning attacks, and we show that Cronus is the only secure method, due to its tight robustness guarantee. Treating local models as black-box, reduces the information leakage through models, and enables us using existing privacy-preserving algorithms that mitigate the risk of information leakage through the model’s output (predictions). Cronus also has a significantly lower sample complexity, compared to federated learning, which does not bind its security to the number of participants.
Mining User Behaviour from Smartphone data, a literature review
Machine Learning, Machine Learning. 4 authors. pdf
To study users’ travel behaviour and travel time between origin and destination, researchers employ travel surveys. Although there is consensus in the field about the potential, after over ten years of research and field experimentation, Smartphone-based travel surveys still did not take off to a large scale. …
Here, computer intelligence algorithms take the role that operators have in Traditional Travel Surveys; since we train each algorithm on data, performances rest on the data quality, thus on the ground truth. Inaccurate validations affect negatively: labels, algorithms’ training, travel diaries precision, and therefore data validation, within a very critical loop. Interestingly, boundaries are proven burdensome to push even for Machine Learning methods. To support optimal investment decisions for practitioners, we expose the drivers they should consider when assessing what they need against what they get. This paper highlights and examines the critical aspects of the underlying research and provides some recommendations: (i) from the device perspective, on the main physical limitations; (ii) from the application perspective, the methodological framework deployed for the automatic generation of travel diaries; (iii) from the ground truth perspective, the relationship between user interaction, methods, and data.
Meta-Learning PAC-Bayes Priors in Model Averaging
Machine Learning, Machine Learning. 4 authors. pdf
Nowadays model uncertainty has become one of the most important problems in both academia and industry. In this paper, we mainly consider the scenario in which we have a common model set used for model averaging instead of selecting a single final model via a model selection procedure to account for this model’s uncertainty to improve reliability and accuracy of inferences. …
Here one main challenge is to learn the prior over the model set. To tackle this problem, we propose two data-based algorithms to get proper priors for model averaging. One is for meta-learner, the analysts should use historical similar tasks to extract the information about the prior. The other one is for base-learner, a subsampling method is used to deal with the data step by step. Theoretically, an upper bound of risk for our algorithm is presented to guarantee the performance of the worst situation. In practice, both methods perform well in simulations and real data studies, especially with poor quality data.
Universal Inference Using the Split Likelihood Ratio Test
Statistics Theory, Machine Learning, Statistics Theory, Methodology. 3 authors. pdf
We propose a general method for constructing hypothesis tests and confidence sets that have finite sample guarantees without regularity conditions. We refer to such procedures as “universal.” …
The method is very simple and is based on a modified version of the usual likelihood ratio statistic, that we call “the split likelihood ratio test” (split LRT). The method is especially appealing for irregular statistical models. Canonical examples include mixture models and models that arise in shape-constrained inference. Constructing tests and confidence sets for such models is notoriously difficult. Typical inference methods, like the likelihood ratio test, are not useful in these cases because they have intractable limiting distributions. In contrast, the method we suggest works for any parametric model and also for some nonparametric models. The split LRT can also be used with profile likelihoods to deal with nuisance parameters, and it can also be run sequentially to yield anytime-valid p-values and confidence sequences.
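
The split likelihood ratio idea can be sketched in a few lines for a deliberately easy case. The example below assumes a unit-variance Gaussian model and a point null on the mean, which are choices made here for illustration and are far simpler than the irregular models the abstract targets. The data are split in two halves, an alternative estimate is fitted on one half, the likelihood ratio is evaluated on the other half, and the null is rejected when the ratio exceeds 1/alpha. The function name `split_lrt_gaussian_mean` and its parameters are hypothetical.

```python
import numpy as np


def split_lrt_gaussian_mean(x, mu0=0.0, alpha=0.05, seed=0):
    """Split LRT for H0: mean == mu0 under an assumed N(mu, 1) model.

    Returns the log of the split likelihood ratio and the reject decision.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    order = rng.permutation(x.size)
    half = x.size // 2
    d0, d1 = x[order[:half]], x[order[half:]]

    # Estimate the alternative on d1 only (here, the sample mean).
    mu1_hat = d1.mean()

    # Log likelihood ratio evaluated on the held-out half d0.
    # The null is a single point, so its "MLE" is just mu0.
    log_ratio = 0.5 * np.sum((d0 - mu0) ** 2) - 0.5 * np.sum((d0 - mu1_hat) ** 2)

    # Reject when the ratio exceeds 1 / alpha.
    reject = bool(log_ratio > np.log(1.0 / alpha))
    return log_ratio, reject


rng = np.random.default_rng(1)
sample = rng.normal(loc=3.0, scale=1.0, size=200)
print(split_lrt_gaussian_mean(sample, mu0=0.0))
```

The 1/alpha threshold comes from Markov's inequality applied to the ratio, whose expectation under the null is at most one, so no asymptotic distribution is needed.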
Characterizing the Decision Boundary of Deep Neural Networks
Machine Learning, Machine Learning, Computer Vision and Pattern Recognition. 3 authors. pdf
Deep neural networks and in particular, deep neural classifiers have become an integral part of many modern applications. Despite their practical success, we still have limited knowledge of how they work and the demand for such an understanding is evergrowing. …
In this regard, one crucial aspect of deep neural network classifiers that can help us deepen our knowledge about their decision-making behavior is to investigate their decision boundaries. Nevertheless, this is contingent upon having access to samples populating the areas near the decision boundary. To achieve this, we propose a novel approach we call Deep Decision boundary Instance Generation (DeepDIG). DeepDIG utilizes a method based on adversarial example generation as an effective way of generating samples near the decision boundary of any deep neural network model. Then, we introduce a set of important principled characteristics that take advantage of the generated instances near the decision boundary to provide multifaceted understandings of deep neural networks. We have performed extensive experiments on multiple representative datasets across various deep neural network models and characterized their decision boundaries.
Detection of Community Structures in Networks with Nodal Features based on Generative Probabilistic Approach
Social and Information Networks, Machine Learning. 3 authors. pdf
Community detection is considered as a fundamental task in analyzing social networks. Even though many techniques have been proposed for community detection, most of them are based exclusively on the connectivity structures. …
However, there are node features in real networks, such as gender types in social networks, feeding behavior in ecological networks, and location on e-trading networks, that can be further leveraged with the network structure to attain more accurate community detection methods. We propose a novel probabilistic graphical model to detect communities by taking into account both network structure and nodes’ features. The proposed approach learns the relevant features of communities through a generative probabilistic model without any prior assumption on the communities. Furthermore, the model is capable of determining the strength of node features and structural elements of the networks on shaping the communities. The effectiveness of the proposed approach over the state-of-the-art algorithms is revealed on synthetic and benchmark networks.
Online Algorithms for Multiclass Classification using Partial Labels
Machine Learning, Machine Learning. 2 authors. pdf
In this paper, we propose online algorithms for multiclass classification using partial labels. We propose two variants of Perceptron called Avg Perceptron and Max Perceptron to deal with the partial labeled data. …
We also propose Avg Pegasos and Max Pegasos, which are extensions of Pegasos algorithm. We also provide mistake bounds for Avg Perceptron and regret bound for Avg Pegasos. We show the effectiveness of the proposed approaches by experimenting on various datasets and comparing them with the standard Perceptron and Pegasos.
mRMR-DNN with Transfer Learning for Intelligent Fault Diagnosis of Rotating Machines
Machine Learning, Machine Learning. 2 authors. pdf
In recent years, intelligent condition-based monitoring of rotary machinery systems has become a major research focus of machine fault diagnosis. In condition-based monitoring, it is challenging to form a large-scale well-annotated dataset due to the expense of data acquisition and costly annotation. …
Along with that, the generated data have a large number of redundant features, which degrade the performance of the machine learning models. To overcome this, we have utilized the advantages of minimum redundancy maximum relevance (mRMR) and transfer learning with a deep learning model. In this work, mRMR is combined with deep learning and deep transfer learning frameworks to improve the fault diagnostics performance in terms of accuracy and computational complexity. The mRMR reduces the redundant information from data and increases the deep learning performance, whereas transfer learning reduces a large amount of data dependency for training the model. In the proposed work, two frameworks, i.e., mRMR with deep learning and mRMR with deep transfer learning, have been explored and validated on CWRU and IMS rolling element bearings datasets. The analysis shows that the proposed frameworks are able to obtain better diagnostic accuracy in comparison with existing methods and are also able to handle data with a large number of features more quickly.

Machine Learning (cs.LG): 20 new

Machine Learning (cs.LG)
Broad Learning System Based on Maximum Correntropy Criterion
Machine Learning, Machine Learning. 10 authors. pdf
As an effective and efficient discriminative learning method, Broad Learning System (BLS) has received increasing attention due to its outstanding performance in various regression and classification problems. However, the standard BLS is derived under the minimum mean square error (MMSE) criterion, which is, of course, not always a good choice due to its sensitivity to outliers. …
To enhance the robustness of BLS, we propose in this work to adopt the maximum correntropy criterion (MCC) to train the output weights, obtaining a correntropy based broad learning system (C-BLS). Thanks to the inherent superiorities of MCC, the proposed C-BLS is expected to achieve excellent robustness to outliers while maintaining the original performance of the standard BLS in Gaussian or noise-free environment. In addition, three alternative incremental learning algorithms, derived from a weighted regularized least-squares solution rather than pseudoinverse formula, for C-BLS are developed. With the incremental learning algorithms, the system can be updated quickly without the entire retraining process from the beginning, when some new samples arrive or the network deems to be expanded. Experiments on various regression and classification datasets are reported to demonstrate the desirable performance of the new methods.
TF3P: Three-dimensional Force Fields Fingerprint Learned by Deep Capsular Network
Biomolecules, Machine Learning, Quantitative Methods. 8 authors. pdf
Molecular fingerprints are the workhorse in ligand-based drug discovery. In recent years, increasing number of research papers reported fascinating results on using deep neural networks to learn 2D molecular representations as fingerprints. …
One may anticipate that the integration of deep learning would also contribute to the prosperity of 3D fingerprints. Here, we presented a new 3D small molecule fingerprint, the three-dimensional force fields fingerprint (TF3P), learned by deep capsular network whose training is in no need of labeled dataset for specific predictive tasks. TF3P can encode the 3D force fields information of molecules and demonstrates its stronger ability to capture 3D structural changes, recognize molecules alike in 3D but not in 2D, and recognize similar targets inaccessible by other fingerprints, including the solely existing 3D fingerprint E3FP, based on only ligands similarity. Furthermore, TF3P is compatible with both statistical models (e.g. similarity ensemble approach) and machine learning models. Altogether, we report TF3P as a new 3D small molecule fingerprint with promising future in ligand-based drug discovery.
Large Scale Learning of General Visual Representations for Transfer
Machine Learning, Computer Vision and Pattern Recognition. 7 authors. pdf
Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the weights on the target task. …
We scale up pre-training, and create a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components, and transferring using a simple heuristic, we achieve strong performance on over 20 datasets. BiT performs well across a surprisingly wide range of data regimes - from 10 to 1M labeled examples. BiT achieves 87.8% top-1 accuracy on ILSVRC-2012, 99.3% on CIFAR-10, and 76.7% on the Visual Task Adaptation Benchmark (which includes 19 tasks). On small datasets, BiT attains 86.4% on ILSVRC-2012 with 25 examples per class, and 97.6% on CIFAR-10 with 10 examples per class. We conduct detailed analysis of the main components that lead to high transfer performance.
Computation Reallocation for Object Detection
Machine Learning, Computer Vision and Pattern Recognition. 7 authors. pdf
The allocation of computation resources in the backbone is a crucial issue in object detection. However, classification allocation pattern is usually adopted directly to object detector, which is proved to be sub-optimal. …
In order to reallocate the engaged computation resources in a more efficient way, we present CR-NAS (Computation Reallocation Neural Architecture Search) that can learn computation reallocation strategies across different feature resolution and spatial position directly on the target detection dataset. A two-level reallocation space is proposed for both stage and spatial reallocation. A novel hierarchical search procedure is adopted to cope with the complex search space. We apply CR-NAS to multiple backbones and achieve consistent improvements. Our CR-ResNet50 and CR-MobileNetV2 outperform the baseline by 1.9% and 1.7% COCO AP respectively without any additional computation budget. The models discovered by CR-NAS can be equipped to other powerful detection neck/head and be easily transferred to other dataset, e.g. PASCAL VOC, and other vision tasks, e.g. instance segmentation. Our CR-NAS can be used as a plugin to improve the performance of various networks, which is demanding.
Audio-based automatic mating success prediction of giant pandas
Audio and Speech Processing, Machine Learning, Sound. 7 authors. pdf
Giant pandas, stereotyped as silent animals, make significantly more vocal sounds during breeding season, suggesting that sounds are essential for coordinating their reproduction and expression of mating preference. Previous biological studies have also proven that giant panda sounds are correlated with mating results and reproduction. …
This paper makes the first attempt to devise an automatic method for predicting mating success of giant pandas based on their vocal sounds. Given an audio sequence of mating giant pandas recorded during breeding encounters, we first crop out the segments with vocal sound of giant pandas, and normalize their magnitude and length. We then extract acoustic features from the audio segment and feed the features into a deep neural network, which classifies the mating into success or failure. The proposed deep neural network employs convolution layers followed by bidirectional gated recurrent units to extract vocal features, and applies attention mechanism to force the network to focus on most relevant features. Evaluation experiments on a data set collected during the past nine years obtain promising results, proving the potential of audio-based automatic mating success prediction methods in assisting giant panda reproduction.
TRADI: Tracking deep neural network weight distributions
Machine Learning, Machine Learning, Computer Vision and Pattern Recognition. 5 authors. pdf
During training, the weights of a Deep Neural Network (DNN) are optimized from a random initialization towards a nearly optimum value minimizing a loss function. Only this final state of the weights is typically kept for testing, while the wealth of information on the geometry of the weight space, accumulated over the descent towards the minimum is discarded. …
In this work we propose to make use of this knowledge and leverage it for computing the distributions of the weights of the DNN. This can be further used for estimating the epistemic uncertainty of the DNN by sampling an ensemble of networks from these distributions. To this end we introduce a method for tracking the trajectory of the weights during optimization, that does not require any changes in the architecture nor on the training procedure. We evaluate our method on standard classification and regression benchmarks, and on out-of-distribution detection for classification and semantic segmentation. We achieve competitive results, while preserving computational efficiency in comparison to other popular approaches.
Attention-Aware Answers of the Crowd
Machine Learning, Machine Learning. 5 authors. pdf
Crowdsourcing is a relatively economic and efficient solution to collect annotations from the crowd through online platforms. Answers collected from workers with different expertise may be noisy and unreliable, and the quality of annotated data needs to be further maintained. …
Various solutions have been attempted to obtain high-quality annotations. However, they all assume that workers’ label quality is stable over time (always at the same level whenever they conduct the tasks). In practice, workers’ attention level changes over time, and the ignorance of which can affect the reliability of the annotations. In this paper, we focus on a novel and realistic crowdsourcing scenario involving attention-aware annotations. We propose a new probabilistic model that takes into account workers’ attention to estimate the label quality. Expectation propagation is adopted for efficient Bayesian inference of our model, and a generalized Expectation Maximization algorithm is derived to estimate both the ground truth of all tasks and the label-quality of each individual crowd worker with attention. In addition, the number of tasks best suited for a worker is estimated according to changes in attention. Experiments against related methods on three real-world and one semi-simulated datasets demonstrate that our method quantifies the relationship between workers’ attention and label-quality on the given tasks, and improves the aggregated labels.
ADD-Lib: Decision Diagrams in Practice
Machine Learning, Programming Languages, Artificial Intelligence, Software Engineering. 4 authors. pdf
In the paper, we present the ADD-Lib, our efficient and easy to use framework for Algebraic Decision Diagrams (ADDs). The focus of the ADD-Lib is not so much on its efficient implementation of individual operations, which are taken by other established ADD frameworks, but its ease and flexibility, which arise at two levels: the level of individual ADD-tools, which come with a dedicated user-friendly web-based graphical user interface, and at the meta level, where such tools are specified. …
Both levels are described in the paper: the meta level by explaining how we can construct an ADD-tool tailored for Random Forest refinement and evaluation, and the accordingly generated Web-based domain-specific tool, which we also provide as an artifact for cooperative experimentation. In particular, the artifact allows readers to combine a given Random Forest with their own ADDs regarded as expert knowledge and to experience the corresponding effect.
Attack-Resistant Federated Learning with Residual-based Reweighting
Machine Learning, Machine Learning. 4 authors. pdf
Federated learning has a variety of applications in multiple domains by utilizing private training data stored on different devices. However, the aggregation process in federated learning is highly vulnerable to adversarial attacks so that the global model may behave abnormally under attacks. …
To tackle this challenge, we present a novel aggregation algorithm with residual-based reweighting to defend federated learning. Our aggregation algorithm combines repeated median regression with the reweighting scheme in iteratively reweighted least squares. Our experiments show that our aggregation algorithm outperforms other alternative algorithms in the presence of label-flipping, backdoor, and Gaussian noise attacks. We also provide theoretical guarantees for our aggregation algorithm.
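The abstract names repeated median regression as one ingredient of the aggregation rule. As a rough illustration of why that estimator resists outliers, here is a minimal sketch of the repeated median line fit on its own. The function name, the toy data, and the choice of intercept estimator are assumptions for illustration; the abstract does not say how residuals are turned into client weights, so that step is left out.

```python
from statistics import median

def repeated_median_line(xs, ys):
    """Fit y = intercept + slope * x with the repeated median estimator.

    Assumes the x values are distinct (hypothetical helper, not the paper's code).
    """
    n = len(xs)
    per_point = []
    for i in range(n):
        # Median slope over all lines through point i and another point j.
        slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
                  for j in range(n) if j != i and xs[j] != xs[i]]
        per_point.append(median(slopes))
    slope = median(per_point)
    # One common choice: the median of the slope-corrected values.
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept

# Toy data: six points on y = 2x + 1, plus one value standing in for a poisoned update.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1, 3, 5, 7, 9, 11, 100]
slope, intercept = repeated_median_line(xs, ys)
residuals = [abs(y - (intercept + slope * x)) for x, y in zip(xs, ys)]
```

On this toy data the fitted line follows the six clean points, so only the last point has a large residual, which is the kind of signal a residual-based reweighting scheme could use to down-weight it.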
Cronus: Robust and Heterogeneous Collaborative Learning with Black-Box Knowledge Transfer
Machine Learning, Machine Learning, Cryptography and Security. 4 authors. pdf
Collaborative (federated) learning enables multiple parties to train a model without sharing their private data, but through repeated sharing of the parameters of their local models. Despite its advantages, this approach has many known privacy and security weaknesses and performance overhead, in addition to being limited only to models with homogeneous architectures. …
Shared parameters leak a significant amount of information about the local (and supposedly private) datasets. Besides, federated learning is severely vulnerable to poisoning attacks, where some participants can adversarially influence the aggregate parameters. Large models, with high dimensional parameter vectors, are in particular highly susceptible to privacy and security attacks: a curse of dimensionality in federated learning. We argue that sharing parameters is the most naive way of information exchange in collaborative learning, as they open all the internal state of the model to inference attacks, and maximize the model’s malleability by stealthy poisoning attacks. We propose Cronus, a robust collaborative machine learning framework. The simple yet effective idea behind designing Cronus is to control, unify, and significantly reduce the dimensions of the exchanged information between parties, through robust knowledge transfer between their black-box local models. We evaluate all existing federated learning algorithms against poisoning attacks, and we show that Cronus is the only secure method, due to its tight robustness guarantee. Treating local models as black boxes reduces the information leakage through models and enables us to use existing privacy-preserving algorithms that mitigate the risk of information leakage through the model’s output (predictions). Cronus also has a significantly lower sample complexity, compared to federated learning, which does not bind its security to the number of participants.
Mining User Behaviour from Smartphone data, a literature review
Machine Learning, Machine Learning. 4 authors. pdf
To study users’ travel behaviour and travel time between origin and destination, researchers employ travel surveys. Although there is consensus in the field about their potential, after over ten years of research and field experimentation, Smartphone-based travel surveys have still not taken off at a large scale. …
Here, computer intelligence algorithms take the role that operators have in Traditional Travel Surveys; since we train each algorithm on data, performance rests on the data quality, and thus on the ground truth. Inaccurate validations negatively affect labels, algorithm training, travel-diary precision, and therefore data validation, within a very critical loop. Interestingly, these boundaries prove hard to push even for Machine Learning methods. To support optimal investment decisions for practitioners, we expose the drivers they should consider when assessing what they need against what they get. This paper highlights and examines the critical aspects of the underlying research and provides some recommendations: (i) from the device perspective, on the main physical limitations; (ii) from the application perspective, on the methodological framework deployed for the automatic generation of travel diaries; (iii) from the ground truth perspective, on the relationship between user interaction, methods, and data.
Meta-Learning PAC-Bayes Priors in Model Averaging
Machine Learning, Machine Learning. 4 authors. pdf
Nowadays model uncertainty has become one of the most important problems in both academia and industry. In this paper, we mainly consider the scenario in which we have a common model set used for model averaging instead of selecting a single final model via a model selection procedure to account for this model’s uncertainty to improve reliability and accuracy of inferences. …
Here one main challenge is to learn the prior over the model set. To tackle this problem, we propose two data-based algorithms to get proper priors for model averaging. One is for the meta-learner: analysts use historical similar tasks to extract information about the prior. The other is for the base-learner: a subsampling method is used to process the data step by step. Theoretically, an upper bound on the risk of our algorithm is presented to guarantee performance in the worst case. In practice, both methods perform well in simulations and real data studies, especially with poor quality data.
Comparison of the P300 detection accuracy related to the BCI speller and image recognition scenarios
Neurons and Cognition, Human-Computer Interaction, Machine Learning, Signal Processing. 4 authors. pdf
There are several protocols in Electroencephalography (EEG) recording scenarios which produce various types of event-related potentials (ERP). The P300 pattern is a well-known ERP which is produced by auditory and visual oddball paradigms and by BCI speller systems. …
In this study, the separability of P300 and non-P300 signals is investigated in two scenarios: an image recognition paradigm and a BCI speller. The image recognition scenario is an experiment that examines participants’ knowledge of an image shown to them earlier by analyzing the EEG signal recorded while they observe that image as a visual stimulus. To do this, three well-known classifiers (SVM, Bayes LDA, and sparse logistic regression) were used to classify EEG recordings in a six-class problem. Filtered and down-sampled temporal samples of the EEG recordings were used as features for classifying the P300 pattern. Also, different sets of EEG recordings, with 4, 8, and 16 channels and different numbers of trials, were used to cover various situations in the comparison. Accuracy increased with the number of trials and channels. The results show that the image recognition scenario yields better accuracy across the different channel sets and numbers of trials. It can therefore be concluded that the P300 pattern produced in the image recognition paradigm is more separable than the one produced by the BCI (matrix) speller.
Assessing differentially private deep learning with Membership Inference
Machine Learning, Cryptography and Security. 4 authors. pdf
Releasing data in the form of trained neural networks with differential privacy promises meaningful anonymization. However, there is an inherent privacy-accuracy trade-off in differential privacy which is challenging to assess for non-privacy experts. …
Furthermore, local and central differential privacy mechanisms are available to either anonymize the training data or the learnt neural network, and the privacy parameter \(\epsilon\) cannot be used to compare these two mechanisms. We propose to measure privacy through a black-box membership inference attack and compare the privacy-accuracy trade-off for different local and central differential privacy mechanisms. Furthermore, we need to evaluate whether differential privacy is a useful mechanism in practice since differential privacy will especially be used by data scientists if membership inference risk is lowered more than accuracy. We experiment with several datasets and show that neither local differential privacy nor central differential privacy yields a consistently better privacy-accuracy trade-off in all cases. We also show that the relative privacy-accuracy trade-off, instead of strictly declining linearly over \(\epsilon\), is only favorable within a small interval. For this purpose we propose \(\varphi\), a ratio expressing the relative privacy-accuracy trade-off.
Characterizing the Decision Boundary of Deep Neural Networks
Machine Learning, Machine Learning, Computer Vision and Pattern Recognition. 3 authors. pdf
Deep neural networks and in particular, deep neural classifiers have become an integral part of many modern applications. Despite their practical success, we still have limited knowledge of how they work and the demand for such an understanding is evergrowing. …
In this regard, one crucial aspect of deep neural network classifiers that can help us deepen our knowledge about their decision-making behavior is to investigate their decision boundaries. Nevertheless, this is contingent upon having access to samples populating the areas near the decision boundary. To achieve this, we propose a novel approach we call Deep Decision boundary Instance Generation (DeepDIG). DeepDIG utilizes a method based on adversarial example generation as an effective way of generating samples near the decision boundary of any deep neural network model. Then, we introduce a set of important principled characteristics that take advantage of the generated instances near the decision boundary to provide multifaceted understandings of deep neural networks. We have performed extensive experiments on multiple representative datasets across various deep neural network models and characterized their decision boundaries.
Robustness of Brain Tumor Segmentation
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning. 3 authors. pdf
We address the generalization behavior of deep neural networks in the context of brain tumor segmentation. While current topologies show an increasingly complex structure, the overall benchmark performance improves only negligibly. …
In our experiments, we demonstrate that a well trained U-Net shows the best generalization behavior and is sufficient to solve this segmentation problem. We illustrate why extensions of this model can be not only pointless but even harmful in a realistic scenario. Also, we suggest two simple modifications (that do not alter the topology) to further improve its generalization performance.
Multi-Graph Transformer for Free-Hand Sketch Recognition
Machine Learning, Computer Vision and Pattern Recognition. 3 authors. pdf
Learning meaningful representations of free-hand sketches remains a challenging task given the signal sparsity and the high-level abstraction of sketches. Existing techniques have focused on exploiting either the static nature of sketches with Convolutional Neural Networks (CNNs) or the temporal sequential property with Recurrent Neural Networks (RNNs). …
In this work, we propose a new representation of sketches as multiple sparsely connected graphs. We design a novel Graph Neural Network (GNN), the Multi-Graph Transformer (MGT), for learning representations of sketches from multiple graphs which simultaneously capture global and local geometric stroke structures, as well as temporal information. We report extensive numerical experiments on a sketch recognition task to demonstrate the performance of the proposed approach. Particularly, MGT applied on 414k sketches from Google QuickDraw: (i) achieves small recognition gap to the CNN-based performance upper bound (72.80% vs. 74.22%), and (ii) outperforms all RNN-based models by a significant margin. To the best of our knowledge, this is the first work proposing to represent sketches as graphs and apply GNNs for sketch recognition. Code and trained models are available at https://github.com/PengBoXiangShang/multigraph_transformer.
Online Algorithms for Multiclass Classification using Partial Labels
Machine Learning, Machine Learning. 2 authors. pdf
In this paper, we propose online algorithms for multiclass classification using partial labels. We propose two variants of Perceptron called Avg Perceptron and Max Perceptron to deal with the partial labeled data. …
We also propose Avg Pegasos and Max Pegasos, which are extensions of Pegasos algorithm. We also provide mistake bounds for Avg Perceptron and regret bound for Avg Pegasos. We show the effectiveness of the proposed approaches by experimenting on various datasets and comparing them with the standard Perceptron and Pegasos.
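To make the partial-label setting concrete, here is a minimal sketch of one plausible reading of an averaged-score Perceptron update. The abstract does not spell out the update rule, so everything below is an assumption: the model scores the candidate label set by the mean score of its labels, compares it with the best-scoring label outside the set, and on a mistake moves the candidate weights toward the example and the rival's weights away from it. All function names are hypothetical.

```python
def train_avg_perceptron(X, candidate_sets, n_classes, epochs=10):
    """Multiclass Perceptron trained from candidate label sets (assumed update rule)."""
    dim = len(X[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    for _ in range(epochs):
        for x, cand in zip(X, candidate_sets):
            others = [k for k in range(n_classes) if k not in cand]
            if not others:
                continue  # a set containing every label carries no information
            scores = [sum(w * v for w, v in zip(W[k], x)) for k in range(n_classes)]
            avg_cand = sum(scores[k] for k in cand) / len(cand)
            rival = max(others, key=lambda k: scores[k])
            if avg_cand <= scores[rival]:
                # Mistake: share the positive update among candidates, penalize the rival.
                for k in cand:
                    W[k] = [w + v / len(cand) for w, v in zip(W[k], x)]
                W[rival] = [w - v for w, v in zip(W[rival], x)]
    return W

def predict(W, x):
    scores = [sum(w * v for w, v in zip(row, x)) for row in W]
    return max(range(len(W)), key=lambda k: scores[k])

# Toy data: one-hot inputs, each annotated with a set of candidate labels.
X = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
candidate_sets = [{0, 1}, {1}, {0, 2}]
W = train_avg_perceptron(X, candidate_sets, n_classes=3)
predictions = [predict(W, x) for x in X]
```

On this toy data every prediction ends up inside its candidate set, which is all the supervision a partial label provides.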
mRMR-DNN with Transfer Learning for Intelligent Fault Diagnosis of Rotating Machines
Machine Learning, Machine Learning. 2 authors. pdf
In recent years, intelligent condition-based monitoring of rotary machinery systems has become a major research focus of machine fault diagnosis. In condition-based monitoring, it is challenging to form a large-scale well-annotated dataset due to the expense of data acquisition and costly annotation. …
Along with that, the generated data have a large number of redundant features, which degrades the performance of machine learning models. To overcome this, we combine the advantages of minimum redundancy maximum relevance (mRMR) and transfer learning with a deep learning model. In this work, mRMR is combined with deep learning and a deep transfer learning framework to improve fault diagnostics performance in terms of accuracy and computational complexity. mRMR reduces redundant information in the data and increases deep learning performance, whereas transfer learning reduces the amount of data needed to train the model. In the proposed work, two frameworks, i.e., mRMR with deep learning and mRMR with deep transfer learning, have been explored and validated on the CWRU and IMS rolling element bearing datasets. The analysis shows that the proposed frameworks obtain better diagnostic accuracy than existing methods and also handle data with a large number of features more quickly.
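For readers unfamiliar with mRMR, the sketch below shows the usual greedy form of the idea on discrete features: pick the feature with the highest mutual information with the label, then repeatedly add the feature whose relevance minus its mean redundancy with the already selected features is largest. This is a generic illustration under those assumptions, not the paper's pipeline; the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(a, b):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log2((c / n) / ((pa[u] / n) * (pb[v] / n)))
               for (u, v), c in pab.items())

def mrmr_select(columns, labels, k):
    """Greedy mRMR: maximize relevance minus mean redundancy (difference criterion)."""
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = list(range(len(columns)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(mutual_information(columns[j], columns[s])
                             for s in selected) / len(selected)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: feature 1 duplicates feature 0; feature 2 carries the complementary bit.
columns = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
labels = [0, 1, 2, 3]
chosen = mrmr_select(columns, labels, k=2)
```

Here all three features are equally relevant on their own, but the duplicate is skipped in the second round because it adds nothing beyond the feature already chosen.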
An Analisys of Application Logs with Splunk : developing an App for the synthetic analysis of data and security incidents
Machine Learning, Cryptography and Security. 1 author. pdf
The present work aims to enhance the application logs of a hypothetical infrastructure platform, and to build an App that displays synthetic data about performance, anomalies and security incidents in the form of a Dashboard. The reference architecture, with multiple applications and multiple HW distribution, implementing a Service Oriented Architecture, is a real case whose details have been abstracted because we want to extend the concept to all architectures with similar characteristics. …

Statistical Finance (q-fin.ST): 1 new

Statistical Finance (q-fin.ST)
The Dynamics of Financial Markets: Fibonacci numbers, Elliott waves, and solitons
Statistical Finance. 1 author. pdf
In this paper, an information-theoretic approach is applied to the description of financial markets. A model which is expected to describe market dynamics is presented. …
The model shows that market trend and cycle dynamics can be described from a unified viewpoint. The model predictions agree comparatively well with the Fibonacci ratios and numbers used for the analysis of market price and time projections. It proves possible to link time and price projections, which allows the moment of trend termination to be predicted more accurately and well in advance. The model is tested against real data from the stock and financial markets.

Data Science arXiv by Primary Tag

The tables below show abstracts organized by category with hyperlinks back to the arXiv site.

Computer Science

Computer Vision and Pattern Recognition (cs.CV)
Fast and deep neuromorphic learning with time-to-first-spike coding
Neurons and Cognition, Neural and Evolutionary Computing, Machine Learning, Emerging Technologies. 11 authors. pdf
For a biological agent operating under environmental pressure, energy consumption and reaction times are of critical importance. Similarly, engineered systems also strive for short time-to-solution and low energy-to-solution characteristics. …
At the level of neuronal implementation, this implies achieving the desired results with as few and as early spikes as possible. In the time-to-first-spike coding framework, both of these goals are inherently emerging features of learning. Here, we describe a rigorous derivation of error-backpropagation-based learning for hierarchical networks of leaky integrate-and-fire neurons. We explicitly address two issues that are relevant for both biological plausibility and applicability to neuromorphic substrates by incorporating dynamics with finite time constants and by optimizing the backward pass with respect to substrate variability. This narrows the gap between previous models of first-spike-time learning and biological neuronal dynamics, thereby also enabling fast and energy-efficient inference on analog neuromorphic devices that inherit these dynamics from their biological archetypes, which we demonstrate on two generations of the BrainScaleS analog neuromorphic architecture.
Broad Learning System Based on Maximum Correntropy Criterion
Machine Learning, Machine Learning. 10 authors. pdf
As an effective and efficient discriminative learning method, Broad Learning System (BLS) has received increasing attention due to its outstanding performance in various regression and classification problems. However, the standard BLS is derived under the minimum mean square error (MMSE) criterion, which is, of course, not always a good choice due to its sensitivity to outliers. …
To enhance the robustness of BLS, we propose in this work to adopt the maximum correntropy criterion (MCC) to train the output weights, obtaining a correntropy based broad learning system (C-BLS). Thanks to the inherent superiorities of MCC, the proposed C-BLS is expected to achieve excellent robustness to outliers while maintaining the original performance of the standard BLS in Gaussian or noise-free environments. In addition, three alternative incremental learning algorithms for C-BLS are developed, derived from a weighted regularized least-squares solution rather than the pseudoinverse formula. With the incremental learning algorithms, the system can be updated quickly, without retraining from the beginning, when new samples arrive or the network needs to be expanded. Experiments on various regression and classification datasets are reported to demonstrate the desirable performance of the new methods.
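The contrast between a mean-square-error fit and a correntropy fit is easiest to see on the simplest possible model, a single constant. The sketch below is a scalar analog only, not the C-BLS algorithm: maximizing Gaussian-kernel correntropy leads to a fixed-point iteration in which each sample is weighted by a Gaussian kernel of its current error, so distant outliers receive almost no weight. The function name, the kernel width `sigma`, and the median starting point are assumptions for illustration.

```python
from math import exp
from statistics import mean, median

def mcc_location(values, sigma=1.0, iters=50, tol=1e-12):
    """Estimate a constant under the maximum correntropy criterion (Gaussian kernel)."""
    theta = median(values)  # assumed starting point
    for _ in range(iters):
        # Kernel weights shrink toward zero for samples far from the current estimate.
        weights = [exp(-((v - theta) ** 2) / (2 * sigma ** 2)) for v in values]
        new_theta = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta

# Toy data: four readings near 1.0 and one gross outlier.
data = [1.0, 1.1, 0.9, 1.0, 50.0]
mse_estimate = mean(data)          # the minimum mean-square-error estimate
mcc_estimate = mcc_location(data)  # the correntropy-based estimate
```

The mean is dragged far from the bulk of the data by the single outlier, while the correntropy estimate stays near 1.0. The weighted least-squares form of each iteration is the same idea the abstract refers to when it mentions a weighted regularized least-squares solution.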
Audio-Visual Embodied Navigation
Human-Computer Interaction, Computer Vision and Pattern Recognition, Sound, Audio and Speech Processing. 8 authors. pdf
Moving around in the world is naturally a multisensory experience, but today’s embodied agents are deaf - restricted to solely their visual perception of the environment. We introduce audio-visual navigation for complex, acoustically and visually realistic 3D environments. …
By both seeing and hearing, the agent must learn to navigate to an audio-based target. We develop a multi-modal deep reinforcement learning pipeline to train navigation policies end-to-end from a stream of egocentric audio-visual observations, allowing the agent to (1) discover elements of the geometry of the physical space indicated by the reverberating audio and (2) detect and follow sound-emitting targets. We further introduce audio renderings based on geometrical acoustic simulations for a set of publicly available 3D assets and instrument AI-Habitat to support the new sensor, making it possible to insert arbitrary sound sources in an array of apartment, office, and hotel environments. Our results show that audio greatly benefits embodied visual navigation in 3D spaces.
Dense RepPoints: Representing Visual Objects with Dense Point Sets
Computer Vision and Pattern Recognition. 8 authors. pdf
We present an object representation, called Dense RepPoints, for flexible and detailed modeling of object appearance and geometry. In contrast to the coarse geometric localization and feature extraction of bounding boxes, Dense RepPoints adaptively distributes a dense set of points to semantically and geometrically significant positions on an object, providing informative cues for object analysis. …
Techniques are developed to address challenges related to supervised training for dense point sets from image segments annotations and making this extensive representation computationally practical. In addition, the versatility of this representation is exploited to model object structure over multiple levels of granularity. Dense RepPoints significantly improves performance on geometrically-oriented visual understanding tasks, including a \(1.6\) AP gain in object detection on the challenging COCO benchmark.
Ordered or Orderless: A Revisit for Video based Person Re-Identification
Computer Vision and Pattern Recognition. 8 authors. pdf
Is a recurrent network really necessary for learning a good visual representation for video based person re-identification (VPRe-id)? In this paper, we first show that the common practice of employing recurrent neural networks (RNNs) to aggregate temporal spatial features may not be optimal. Specifically, with a diagnostic analysis, we show that the recurrent structure may be less effective at learning temporal dependencies than expected and implicitly yields an orderless representation. …
Based on this observation, we then present a simple yet surprisingly powerful approach for VPRe-id, where we treat VPRe-id as an efficient orderless ensemble of image based person re-identification problems. More specifically, we divide videos into individual images and re-identify the person with an ensemble of image based rankers. Under the i.i.d. assumption, we provide an error bound that sheds light upon how we could improve VPRe-id. Our work also presents a promising way to bridge the gap between video and image based person re-identification. Comprehensive experimental evaluations demonstrate that the proposed solution achieves state-of-the-art performances on multiple widely used datasets (iLIDS-VID, PRID 2011, and MARS).
Large Scale Learning of General Visual Representations for Transfer
Machine Learning, Computer Vision and Pattern Recognition. 7 authors. pdf
Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the weights on the target task. …
We scale up pre-training, and create a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components, and transferring using a simple heuristic, we achieve strong performance on over 20 datasets. BiT performs well across a surprisingly wide range of data regimes - from 10 to 1M labeled examples. BiT achieves 87.8% top-1 accuracy on ILSVRC-2012, 99.3% on CIFAR-10, and 76.7% on the Visual Task Adaptation Benchmark (which includes 19 tasks). On small datasets, BiT attains 86.4% on ILSVRC-2012 with 25 examples per class, and 97.6% on CIFAR-10 with 10 examples per class. We conduct detailed analysis of the main components that lead to high transfer performance.
Computation Reallocation for Object Detection
Machine Learning, Computer Vision and Pattern Recognition. 7 authors. pdf
The allocation of computation resources in the backbone is a crucial issue in object detection. However, the allocation pattern designed for classification is usually adopted directly for object detectors, which has proved to be sub-optimal. …
In order to reallocate the engaged computation resources in a more efficient way, we present CR-NAS (Computation Reallocation Neural Architecture Search) that can learn computation reallocation strategies across different feature resolutions and spatial positions directly on the target detection dataset. A two-level reallocation space is proposed for both stage and spatial reallocation. A novel hierarchical search procedure is adopted to cope with the complex search space. We apply CR-NAS to multiple backbones and achieve consistent improvements. Our CR-ResNet50 and CR-MobileNetV2 outperform the baseline by 1.9% and 1.7% COCO AP respectively without any additional computation budget. The models discovered by CR-NAS can be equipped with other powerful detection necks/heads and be easily transferred to other datasets, e.g. PASCAL VOC, and other vision tasks, e.g. instance segmentation. Our CR-NAS can be used as a plugin to improve the performance of various networks, which is demanding.
Audio-based automatic mating success prediction of giant pandas
Audio and Speech Processing, Machine Learning, Sound. 7 authors. pdf
Giant pandas, stereotyped as silent animals, make significantly more vocal sounds during breeding season, suggesting that sounds are essential for coordinating their reproduction and expression of mating preference. Previous biological studies have also proven that giant panda sounds are correlated with mating results and reproduction. …
This paper makes the first attempt to devise an automatic method for predicting mating success of giant pandas based on their vocal sounds. Given an audio sequence of mating giant pandas recorded during breeding encounters, we first crop out the segments containing giant panda vocalizations and normalize their magnitude and length. We then extract acoustic features from each audio segment and feed the features into a deep neural network, which classifies the mating as a success or failure. The proposed deep neural network employs convolution layers followed by bidirectional gated recurrent units to extract vocal features, and applies an attention mechanism to force the network to focus on the most relevant features. Evaluation experiments on a data set collected during the past nine years obtain promising results, proving the potential of audio-based automatic mating success prediction methods in assisting giant panda reproduction.
TRADI: Tracking deep neural network weight distributions
Machine Learning, Machine Learning, Computer Vision and Pattern Recognition. 5 authors. pdf
During training, the weights of a Deep Neural Network (DNN) are optimized from a random initialization towards a nearly optimum value minimizing a loss function. Only this final state of the weights is typically kept for testing, while the wealth of information on the geometry of the weight space, accumulated over the descent towards the minimum is discarded. …
In this work we propose to make use of this knowledge and leverage it for computing the distributions of the weights of the DNN. This can be further used for estimating the epistemic uncertainty of the DNN by sampling an ensemble of networks from these distributions. To this end we introduce a method for tracking the trajectory of the weights during optimization, that does not require any changes in the architecture nor on the training procedure. We evaluate our method on standard classification and regression benchmarks, and on out-of-distribution detection for classification and semantic segmentation. We achieve competitive results, while preserving computational efficiency in comparison to other popular approaches.
Attention-Aware Answers of the Crowd
Machine Learning, Machine Learning. 5 authors. pdf
Crowdsourcing is a relatively economic and efficient solution to collect annotations from the crowd through online platforms. Answers collected from workers with different expertise may be noisy and unreliable, and the quality of annotated data needs to be further maintained. …
Various solutions have been attempted to obtain high-quality annotations. However, they all assume that workers’ label quality is stable over time (always at the same level whenever they conduct the tasks). In practice, workers’ attention level changes over time, and ignoring this can affect the reliability of the annotations. In this paper, we focus on a novel and realistic crowdsourcing scenario involving attention-aware annotations. We propose a new probabilistic model that takes into account workers’ attention to estimate the label quality. Expectation propagation is adopted for efficient Bayesian inference of our model, and a generalized Expectation Maximization algorithm is derived to estimate both the ground truth of all tasks and the label-quality of each individual crowd worker with attention. In addition, the number of tasks best suited for a worker is estimated according to changes in attention. Experiments against related methods on three real-world and one semi-simulated datasets demonstrate that our method quantifies the relationship between workers’ attention and label-quality on the given tasks, and improves the aggregated labels.
Machine Learning (cs.LG)
Adaptive Distraction Context Aware Tracking Based on Correlation Filter
Computer Vision and Pattern Recognition. 5 authors. pdf
The Discriminative Correlation Filter (CF) uses a circulant convolution operation to provide several training samples for the design of a classifier that can distinguish the target from the background. The filter design may be disturbed by objects close to the target during the tracking process, resulting in tracking failure. …
This paper proposes an adaptive distraction context aware tracking algorithm to solve this problem. In the response map obtained for the previous frame by the CF algorithm, we adaptively find the image blocks that are similar to the target and use them as negative samples. This diminishes the influence of similar image blocks on the classifier in the tracking process and its accuracy is improved. The tracking results on video sequences show that the algorithm can cope with rapid changes such as occlusion and rotation, and can adaptively use the distractive objects around the target as negative samples to improve the accuracy of target tracking.
Deep Manifold Embedding for Hyperspectral Image Classification
Image and Video Processing, Computer Vision and Pattern Recognition. 5 authors. pdf
Deep learning methods have played an increasingly important role in hyperspectral image classification. However, general deep learning methods mainly take advantage of the information in each sample itself or the pairwise information between samples, while ignoring the intrinsic structure of the data as a whole. …
To tackle this problem, this work develops a novel deep manifold embedding method (DMEM) for hyperspectral image classification. First, each class in the image is modelled as a specific nonlinear manifold and the geodesic distance is used to measure the correlation between the samples. Then, based on hierarchical clustering, the manifold structure of the data can be captured and each nonlinear data manifold can be divided into several sub-classes. Finally, considering the distribution of each sub-class and the correlation between different sub-classes, the DMEM is constructed to preserve the estimated geodesic distances on the data manifold between the learned low dimensional features of different samples. Experiments over three real-world hyperspectral image datasets have demonstrated the effectiveness of the proposed method.
PILS: Exploring high-order neighborhoods by pattern mining and injection
Artificial Intelligence. 4 authors. pdf
We introduce pattern injection local search (PILS), an optimization strategy that uses pattern mining to explore high-order local-search neighborhoods, and illustrate its application on the vehicle routing problem. PILS operates by storing a limited number of frequent patterns from elite solutions. …
During the local search, each pattern is used to define one move in which 1) incompatible edges are disconnected, 2) the edges defined by the pattern are reconnected, and 3) the remaining solution fragments are optimally reconnected. Each such move is accepted only in case of solution improvement. As visible in our experiments, this strategy results in a new paradigm of local search, which complements and enhances classical search approaches in a controllable amount of computational time. We demonstrate that PILS identifies useful high-order moves (e.g., 9-opt and 10-opt) which would otherwise not be found by enumeration, and that it significantly improves the performance of state-of-the-art population-based and neighborhood-centered metaheuristics.
Bidding in Spades
Artificial Intelligence. 4 authors. pdf
We present a Spades bidding algorithm that is superior to recreational human players and to publicly available bots. As in Bridge, the game of Spades is composed of two independent phases, bidding and playing. …
This paper focuses on the bidding algorithm, since this phase holds a precise challenge: based on the input, choose the bid that maximizes the agent’s winning probability. Our Bidding in Spades (BIS) algorithm heuristically determines the bidding strategy by comparing the expected utility of each possible bid. A major challenge is how to estimate these expected utilities. To this end, we propose a set of domain-specific heuristics, and then correct them via machine learning using data from real-world players. The algorithm we present can be attached to any playing algorithm. It beats rule-based bidding bots when all use the same playing component. When combined with a rule-based playing algorithm, it is superior to the average recreational human.
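The selection step described in this abstract (compare the expected utility of each possible bid and pick the best) can be sketched in a few lines of Python. This is only an illustration: the function name `best_bid` and the utility numbers are hypothetical, and the paper's heuristics and learned corrections are not reproduced here.

```python
def best_bid(expected_utility):
    """Return the bid with the highest estimated expected utility.

    expected_utility: dict mapping each candidate bid (int) to a
    utility estimate (float). The estimates are assumed to come from
    some upstream heuristic or model (hypothetical here).
    """
    if not expected_utility:
        raise ValueError("no candidate bids")
    # Compare bids by their utility estimate; ties go to the first bid seen.
    return max(expected_utility, key=expected_utility.get)


# Made-up utility estimates for three candidate bids.
estimates = {2: 0.41, 3: 0.57, 4: 0.52}
print(best_bid(estimates))
```

The hard part, as the abstract notes, is producing good utility estimates; the argmax itself is simple.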
ADD-Lib: Decision Diagrams in Practice
Machine Learning, Programming Languages, Artificial Intelligence, Software Engineering. 4 authors. pdf
In the paper, we present the ADD-Lib, our efficient and easy to use framework for Algebraic Decision Diagrams (ADDs). The focus of the ADD-Lib is not so much on its efficient implementation of individual operations, which are taken by other established ADD frameworks, but its ease and flexibility, which arise at two levels: the level of individual ADD-tools, which come with a dedicated user-friendly web-based graphical user interface, and at the meta level, where such tools are specified. …
Both levels are described in the paper: the meta level by explaining how we can construct an ADD-tool tailored for Random Forest refinement and evaluation, and the accordingly generated Web-based domain-specific tool, which we also provide as an artifact for cooperative experimentation. In particular, the artifact allows readers to combine a given Random Forest with their own ADDs regarded as expert knowledge and to experience the corresponding effect.
Attack-Resistant Federated Learning with Residual-based Reweighting
Machine Learning, Machine Learning. 4 authors. pdf
Federated learning has a variety of applications in multiple domains by utilizing private training data stored on different devices. However, the aggregation process in federated learning is highly vulnerable to adversarial attacks so that the global model may behave abnormally under attacks. …
To tackle this challenge, we present a novel aggregation algorithm with residual-based reweighting to defend federated learning. Our aggregation algorithm combines repeated median regression with the reweighting scheme in iteratively reweighted least squares. Our experiments show that our aggregation algorithm outperforms other alternative algorithms in the presence of label-flipping, backdoor, and Gaussian noise attacks. We also provide theoretical guarantees for our aggregation algorithm.
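To give a feel for one ingredient named in this abstract, here is a minimal Python sketch of a repeated median line fit (Siegel's estimator) and the residuals it produces. It is not the paper's aggregation algorithm: how residuals are turned into weights for model updates is not described in the abstract, and the intercept rule used below (median of `y - slope * x`) is one common variant chosen for brevity.

```python
import numpy as np


def repeated_median_line(x, y):
    """Fit y ~ slope * x + intercept with the repeated median estimator.

    Assumes the x values are distinct. The slope is the median over i of
    the median over j != i of the pairwise slopes through points i and j.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    inner_medians = np.empty(n)
    for i in range(n):
        mask = x != x[i]  # drops point i itself (x values are distinct)
        inner_medians[i] = np.median((y[mask] - y[i]) / (x[mask] - x[i]))
    slope = np.median(inner_medians)
    intercept = np.median(y - slope * x)  # simple variant, see lead-in
    return slope, intercept


# Toy data: a clean line y = 2x + 1 with one corrupted point.
x = np.arange(7)
y = 2.0 * x + 1.0
y[3] = 100.0
slope, intercept = repeated_median_line(x, y)
residuals = y - (slope * x + intercept)
print(slope, intercept, int(np.argmax(np.abs(residuals))))
```

Because the fit is driven by medians, the corrupted point barely moves the line and stands out with a large residual, which is the kind of signal a residual-based reweighting scheme can use.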
Mining User Behaviour from Smartphone data, a literature review
Machine Learning, Machine Learning. 4 authors. pdf
To study users’ travel behaviour and travel time between origin and destination, researchers employ travel surveys. Although there is consensus in the field about the potential, after over ten years of research and field experimentation, Smartphone-based travel surveys still did not take off to a large scale. …
Here, computer intelligence algorithms take the role that operators have in Traditional Travel Surveys; since we train each algorithm on data, performance rests on the data quality, and thus on the ground truth. Inaccurate validations negatively affect labels, algorithm training, travel-diary precision, and therefore data validation, within a very critical loop. Interestingly, boundaries prove burdensome to push even for Machine Learning methods. To support optimal investment decisions for practitioners, we expose the drivers they should consider when assessing what they need against what they get. This paper highlights and examines the critical aspects of the underlying research and provides some recommendations: (i) from the device perspective, on the main physical limitations; (ii) from the application perspective, the methodological framework deployed for the automatic generation of travel diaries; (iii) from the ground truth perspective, the relationship between user interaction, methods, and data.
Meta-Learning PAC-Bayes Priors in Model Averaging
Machine Learning, Machine Learning. 4 authors. pdf
Nowadays model uncertainty has become one of the most important problems in both academia and industry. In this paper, we mainly consider the scenario in which we have a common model set used for model averaging instead of selecting a single final model via a model selection procedure to account for this model’s uncertainty to improve reliability and accuracy of inferences. …
Here one main challenge is to learn the prior over the model set. To tackle this problem, we propose two data-based algorithms to get proper priors for model averaging. One is for the meta-learner: analysts use historically similar tasks to extract information about the prior. The other is for the base-learner: a subsampling method is used to process the data step by step. Theoretically, an upper bound on the risk of our algorithm is presented to guarantee worst-case performance. In practice, both methods perform well in simulations and real data studies, especially with poor quality data.
Assessing differentially private deep learning with Membership Inference
Machine Learning, Cryptography and Security. 4 authors. pdf
Releasing data in the form of trained neural networks with differential privacy promises meaningful anonymization. However, there is an inherent privacy-accuracy trade-off in differential privacy which is challenging to assess for non-privacy experts. …
Furthermore, local and central differential privacy mechanisms are available to either anonymize the training data or the learnt neural network, and the privacy parameter \(\epsilon\) cannot be used to compare these two mechanisms. We propose to measure privacy through a black-box membership inference attack and compare the privacy-accuracy trade-off for different local and central differential privacy mechanisms. Furthermore, we need to evaluate whether differential privacy is a useful mechanism in practice since differential privacy will especially be used by data scientists if membership inference risk is lowered more than accuracy. We experiment with several datasets and show that neither local differential privacy nor central differential privacy yields a consistently better privacy-accuracy trade-off in all cases. We also show that the relative privacy-accuracy trade-off, instead of strictly declining linearly over \(\epsilon\), is only favorable within a small interval. For this purpose we propose \(\varphi\), a ratio expressing the relative privacy-accuracy trade-off.
Artificial Intelligence (cs.AI)
Stochastic Fairness and Language-Theoretic Fairness in Planning on Nondeterministic Domains
Formal Languages and Automata Theory, Artificial Intelligence. 3 authors. pdf
We address two central notions of fairness in the literature of planning on nondeterministic fully observable domains. The first, which we call stochastic fairness, is classical, and assumes an environment which operates probabilistically using possibly unknown probabilities. …
The second, which is language-theoretic, assumes that if an action is taken from a given state infinitely often then all its possible outcomes should appear infinitely often (we call this state-action fairness). While the two notions coincide for standard reachability goals, they diverge for temporally extended goals. This important difference has been overlooked in the planning literature, and we argue has led to confusion in a number of published algorithms which use reductions that were stated for state-action fairness, for which they are incorrect, while being correct for stochastic fairness. We remedy this and provide an optimal sound and complete algorithm for solving state-action fair planning for LTL/LTLf goals, as well as a correct proof of the lower bound of the goal-complexity (our proof is general enough that it provides new proofs also for the no-fairness and stochastic-fairness cases). Overall, we show that stochastic fairness is better behaved than state-action fairness.
Characterizing the Decision Boundary of Deep Neural Networks
Machine Learning, Machine Learning, Computer Vision and Pattern Recognition. 3 authors. pdf
Deep neural networks and in particular, deep neural classifiers have become an integral part of many modern applications. Despite their practical success, we still have limited knowledge of how they work and the demand for such an understanding is evergrowing. …
In this regard, one crucial aspect of deep neural network classifiers that can help us deepen our knowledge about their decision-making behavior is to investigate their decision boundaries. Nevertheless, this is contingent upon having access to samples populating the areas near the decision boundary. To achieve this, we propose a novel approach we call Deep Decision boundary Instance Generation (DeepDIG). DeepDIG utilizes a method based on adversarial example generation as an effective way of generating samples near the decision boundary of any deep neural network model. Then, we introduce a set of important principled characteristics that take advantage of the generated instances near the decision boundary to provide multifaceted understandings of deep neural networks. We have performed extensive experiments on multiple representative datasets across various deep neural network models and characterized their decision boundaries.
Detection of Community Structures in Networks with Nodal Features based on Generative Probabilistic Approach
Social and Information Networks, Machine Learning. 3 authors. pdf
Community detection is considered as a fundamental task in analyzing social networks. Even though many techniques have been proposed for community detection, most of them are based exclusively on the connectivity structures. …
However, there are node features in real networks, such as gender types in social networks, feeding behavior in ecological networks, and location on e-trading networks, that can be further leveraged with the network structure to attain more accurate community detection methods. We propose a novel probabilistic graphical model to detect communities by taking into account both network structure and nodes’ features. The proposed approach learns the relevant features of communities through a generative probabilistic model without any prior assumption on the communities. Furthermore, the model is capable of determining the strength of node features and structural elements of the networks on shaping the communities. The effectiveness of the proposed approach over the state-of-the-art algorithms is revealed on synthetic and benchmark networks.
Cryptography and Security (cs.CR)
FHDR: HDR Image Reconstruction from a Single LDR Image using Feedback Network
Image and Video Processing, Computer Vision and Pattern Recognition. 3 authors. pdf
High dynamic range (HDR) image generation from a single exposure low dynamic range (LDR) image has been made possible due to the recent advances in Deep Learning. Various feed-forward Convolutional Neural Networks (CNNs) have been proposed for learning LDR to HDR representations. …
To better utilize the power of CNNs, we exploit the idea of feedback, where the initial low level features are guided by the high level features using a hidden state of a Recurrent Neural Network. Unlike a single forward pass in a conventional feed-forward network, the reconstruction from LDR to HDR in a feedback network is learned over multiple iterations. This enables us to create a coarse-to-fine representation, leading to an improved reconstruction at every iteration. Advantages over standard feed-forward networks include early reconstruction ability and better reconstruction quality with fewer network parameters. We design a dense feedback block and propose an end-to-end feedback network, FHDR, for HDR image generation from a single exposure LDR image. Qualitative and quantitative evaluations show the superiority of our approach over the state-of-the-art methods.
Robust Visual Tracking via Implicit Low-Rank Constraints and Structural Color Histograms
Computer Vision and Pattern Recognition. 3 authors. pdf
With the guaranteed discrimination and efficiency of the spatial appearance model, Discriminative Correlation Filter (DCF) based tracking methods have achieved outstanding performance recently. However, the construction of an effective temporal appearance model is still challenging because filter degeneration becomes a significant factor that causes tracking failures in the DCF framework. …
To encourage temporal continuity and to explore the smooth variation of target appearance, we propose to enhance the low-rank structure of the learned filters, which can be realized by constraining the successive filters within an \(\ell_2\)-norm ball. Moreover, we design a global descriptor, structural color histograms, to provide complementary support to the final response map, improving the stability and robustness of the DCF framework. The experimental results on standard benchmarks demonstrate that our Implicit Low-Rank Constraints and Structural Color Histograms (ILRCSCH) tracker outperforms state-of-the-art methods.
Neural and Evolutionary Computing (cs.NE)
Multi-Graph Transformer for Free-Hand Sketch Recognition
Machine Learning, Computer Vision and Pattern Recognition. 3 authors. pdf
Learning meaningful representations of free-hand sketches remains a challenging task given the signal sparsity and the high-level abstraction of sketches. Existing techniques have focused on exploiting either the static nature of sketches with Convolutional Neural Networks (CNNs) or the temporal sequential property with Recurrent Neural Networks (RNNs). …
In this work, we propose a new representation of sketches as multiple sparsely connected graphs. We design a novel Graph Neural Network (GNN), the Multi-Graph Transformer (MGT), for learning representations of sketches from multiple graphs which simultaneously capture global and local geometric stroke structures, as well as temporal information. We report extensive numerical experiments on a sketch recognition task to demonstrate the performance of the proposed approach. In particular, MGT applied to 414k sketches from Google QuickDraw: (i) achieves a small recognition gap to the CNN-based performance upper bound (72.80% vs. 74.22%), and (ii) outperforms all RNN-based models by a significant margin. To the best of our knowledge, this is the first work proposing to represent sketches as graphs and apply GNNs for sketch recognition. Code and trained models are available at https://github.com/PengBoXiangShang/multigraph_transformer.
Computational framework for monolithic coupling for thin fluid flow in contact interfaces
Computational Physics, Numerical Analysis, Numerical Analysis, Computational Engineering, Finance, and Science. 3 authors. pdf
We develop a computational framework for simulating thin fluid flow in narrow interfaces between contacting solids, which is relevant for a range of engineering, biological and geophysical applications. The treatment of this problem requires coupling between fluid and solid mechanics equations, further complicated by contact constraints and potentially complex geometrical features of contacting surfaces. …
We develop a monolithic finite-element framework for handling contact, thin incompressible viscous flow and fluid-induced tractions on the surface of the solid, suitable for both one- and two-way coupling approaches. Additionally, we consider fluid entrapment in “pools” delimited by contact patches and its pressurisation following a non-linear compressible constitutive law. Image analysis algorithms are adopted to identify the local status of each interface element (i.e. distinguish between contact, fluid flow and trapped fluid zones) within the Newton convergence loop. First, an application of the proposed framework for a problem with a model geometry is given, and the robustness is demonstrated by the DOF-wise and status-wise convergence. The full capability of the developed two-way coupling framework is demonstrated on a problem of a fluid flow in a contact interface between a solid with representative rough surface and a rigid flat. The evolution of the contact pressure, fluid flow pattern and the morphology of trapped fluid zones under increasing external load until the complete sealing of the interface is displayed. Finally, effective properties of flat-on-flat rough contact interfaces such as transmissivity and real contact area growth are calculated using the developed framework, showing qualitatively new results compared to the one-way coupling approximation.
Computational Engineering, Finance, and Science (cs.CE)
Online Algorithms for Multiclass Classification using Partial Labels
Machine Learning, Machine Learning. 2 authors. pdf
In this paper, we propose online algorithms for multiclass classification using partial labels. We propose two variants of Perceptron called Avg Perceptron and Max Perceptron to deal with the partial labeled data. …
We also propose Avg Pegasos and Max Pegasos, which are extensions of Pegasos algorithm. We also provide mistake bounds for Avg Perceptron and regret bound for Avg Pegasos. We show the effectiveness of the proposed approaches by experimenting on various datasets and comparing them with the standard Perceptron and Pegasos.
Social and Information Networks (cs.SI)
mRMR-DNN with Transfer Learning for Intelligent Fault Diagnosis of Rotating Machines
Machine Learning, Machine Learning. 2 authors. pdf
In recent years, intelligent condition-based monitoring of rotary machinery systems has become a major research focus of machine fault diagnosis. In condition-based monitoring, it is challenging to form a large-scale well-annotated dataset due to the expense of data acquisition and costly annotation. …
Along with that, the generated data have a large number of redundant features, which degrade the performance of machine learning models. To overcome this, we utilize the advantages of minimum redundancy maximum relevance (mRMR) and transfer learning with a deep learning model. In this work, mRMR is combined with deep learning and deep transfer learning frameworks to improve fault diagnostic performance in terms of accuracy and computational complexity. The mRMR step reduces redundant information in the data and increases deep learning performance, whereas transfer learning reduces the amount of data needed to train the model. In the proposed work, two frameworks, i.e., mRMR with deep learning and mRMR with deep transfer learning, are explored and validated on the CWRU and IMS rolling element bearing datasets. The analysis shows that the proposed frameworks obtain better diagnostic accuracy than existing methods and also handle data with a large number of features more quickly.
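The idea behind mRMR (prefer features that are relevant to the target but not redundant with features already chosen) can be illustrated with a small greedy selector. This is a sketch under stated assumptions, not the paper's pipeline: it uses absolute Pearson correlation as a stand-in for both relevance and redundancy (classical mRMR is usually formulated with mutual information), and the function name `mrmr_select` is hypothetical.

```python
import numpy as np


def mrmr_select(X, y, k):
    """Greedily pick k feature indices by relevance minus mean redundancy.

    Relevance and redundancy are both measured with absolute Pearson
    correlation (an assumption made for simplicity). Features are
    assumed to be non-constant so the correlations are defined.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    redundancy = np.abs(np.corrcoef(X, rowvar=False))  # feature-feature matrix
    selected = [int(np.argmax(relevance))]  # start with the most relevant
    while len(selected) < min(k, n_features):
        remaining = [j for j in range(n_features) if j not in selected]
        scores = [relevance[j] - redundancy[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected


# Toy data: features 0 and 1 are near-duplicates, feature 2 is independent.
rng = np.random.default_rng(0)
a, b = rng.normal(size=(2, 500))
X = np.column_stack([a, a + 0.01 * rng.normal(size=500), b])
y = a + 0.5 * b
print(mrmr_select(X, y, k=2))
```

In the toy data the second pick is the independent feature rather than the near-duplicate, even though the duplicate is individually more correlated with the target.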
Software Engineering (cs.SE)
Towards Multicellular Biological Deep Neural Nets Based on Transcriptional Regulation
Molecular Networks, Neural and Evolutionary Computing, Emerging Technologies. 1 author. pdf
Artificial neurons built on synthetic gene networks have potential applications ranging from complex cellular decision-making to bioreactor regulation. Furthermore, due to the high information throughput of natural systems, it provides an interesting candidate for biologically-based supercomputing and analog simulations of traditionally intractable problems. …
In this paper, we propose an architecture for constructing multicellular neural networks and programmable nonlinear systems. We design an artificial neuron based on gene regulatory networks and optimize its dynamics for modularity. Using gene expression models, we simulate its ability to perform arbitrary linear classifications from multiple inputs. Finally, we construct a two-layer neural network to demonstrate scalability and nonlinear decision boundaries, and discuss future directions for utilizing uncontrolled neurons in computational tasks.
Sound (cs.SD)
An Analisys of Application Logs with Splunk : developing an App for the synthetic analysis of data and security incidents
Machine Learning, Cryptography and Security. 1 author. pdf
The present work aims to enhance the application logs of a hypothetical infrastructure platform, and to build an App that displays the synthetic data about performance, anomalies and security incidents synthesized in the form of a Dashboard. The reference architecture, with multiple applications and multiple HW distribution, implementing a Service Oriented Architecture, is a real case of which the details have been abstracted because we want to extend the concept to all architectures with similar characteristics. …

Statistics

Methodology (stat.ME)
Aggregating predictions from experts: a scoping review of statistical methods, experiments, and applications
Applications. 4 authors. pdf
Forecasts support decision making in a variety of applications. Statistical models can produce accurate forecasts given abundant training data, but when data is sparse, rapidly changing, or unavailable, statistical models may not be able to make accurate predictions. …
Expert judgmental forecasts—models that combine expert-generated predictions into a single forecast—can make predictions when training data is limited by relying on expert intuition to take the place of concrete training data. Researchers have proposed a wide array of algorithms to combine expert predictions into a single forecast, but there is no consensus on an optimal aggregation model. This scoping review surveyed recent literature on aggregating expert-elicited predictions. We gathered common terminology, aggregation methods, and forecasting performance metrics, and offer guidance to strengthen future work that is growing at an accelerated pace.
Cronus: Robust and Heterogeneous Collaborative Learning with Black-Box Knowledge Transfer
Machine Learning, Machine Learning, Cryptography and Security. 4 authors. pdf
Collaborative (federated) learning enables multiple parties to train a model without sharing their private data, but through repeated sharing of the parameters of their local models. Despite its advantages, this approach has many known privacy and security weaknesses and performance overhead, in addition to being limited only to models with homogeneous architectures. …
Shared parameters leak a significant amount of information about the local (and supposedly private) datasets. Besides, federated learning is severely vulnerable to poisoning attacks, where some participants can adversarially influence the aggregate parameters. Large models, with high dimensional parameter vectors, are in particular highly susceptible to privacy and security attacks: the curse of dimensionality in federated learning. We argue that sharing parameters is the most naive way of exchanging information in collaborative learning, as they open all the internal state of the model to inference attacks, and maximize the model’s malleability by stealthy poisoning attacks. We propose Cronus, a robust collaborative machine learning framework. The simple yet effective idea behind designing Cronus is to control, unify, and significantly reduce the dimensions of the exchanged information between parties, through robust knowledge transfer between their black-box local models. We evaluate all existing federated learning algorithms against poisoning attacks, and we show that Cronus is the only secure method, due to its tight robustness guarantee. Treating local models as black boxes reduces the information leakage through models, and enables us to use existing privacy-preserving algorithms that mitigate the risk of information leakage through the model’s output (predictions). Cronus also has a significantly lower sample complexity, compared to federated learning, which does not bind its security to the number of participants.
Applications (stat.AP)
Power Comparisons in 2x2 Contingency Tables: Odds Ratio versus Pearson Correlation versus Canonical Correlation
Methodology. 3 authors. pdf
It is an important inferential problem to test no association between two binary variables based on data. Tests based on the sample odds ratio are commonly used. …
We bring in a competing test based on the Pearson correlation coefficient. In particular, the odds ratio does not extend to higher order contingency tables, whereas Pearson correlation does. It is important to understand how Pearson correlation stacks up against the odds ratio in 2x2 tables. Another measure of association is the canonical correlation. In this paper, we examine how competitive Pearson correlation is vis-à-vis the odds ratio in terms of power in the binary context, contrasting further with both the Wald Z and Rao Score tests. We generated an extensive collection of joint distributions of the binary variables and estimated the power of the tests under each joint alternative distribution based on random samples. The consensus is that none of the tests dominates the others.
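For readers who want the two association measures side by side, the following Python sketch computes the sample odds ratio and the Pearson correlation of two binary variables (the phi coefficient) from a 2x2 table with cell counts a, b (first row) and c, d (second row). The cell naming and the example counts are illustrative assumptions; the paper's tests and power study are not reproduced.

```python
import math


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c); assumes b and c are non-zero."""
    return (a * d) / (b * c)


def phi_coefficient(a, b, c, d):
    """Pearson correlation of two binary variables (phi coefficient).

    Assumes every row and column total is non-zero.
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


# Made-up counts for illustration.
a, b, c, d = 30, 10, 15, 45
print(odds_ratio(a, b, c, d), round(phi_coefficient(a, b, c, d), 3))
```

Under no association the odds ratio equals 1 and phi equals 0, since both depend on the data through `a*d - b*c` vanishing.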
Machine Learning (stat.ML)
Bayesian Aggregation
Methodology. 1 author. pdf
A general challenge in statistics is prediction in the presence of multiple candidate models or learning algorithms. Model aggregation tries to combine all predictive distributions from individual models, which is more stable and flexible than single model selection. …
In this article we describe when and how to aggregate models under the lens of Bayesian decision theory. Among two widely used methods, Bayesian model averaging (BMA) and Bayesian stacking, we compare their predictive performance, and review their theoretical optimality, probabilistic interpretation, practical implementation, and extensions in complex models.
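As a small illustration of one of the two methods named here, Bayesian model averaging weights each model by its posterior probability, which is proportional to its marginal likelihood times its prior probability. The Python sketch below computes such weights from log marginal likelihoods; the function name `bma_weights` and the input numbers are hypothetical, and stacking (which instead chooses weights by optimizing predictive performance) is not shown.

```python
import numpy as np


def bma_weights(log_marginal_likelihoods, prior=None):
    """Posterior model probabilities from log marginal likelihoods.

    prior: optional array of prior model probabilities; a uniform
    prior is assumed when it is omitted.
    """
    log_ml = np.asarray(log_marginal_likelihoods, dtype=float)
    if prior is None:
        prior = np.full(log_ml.shape, 1.0 / log_ml.size)
    log_post = log_ml + np.log(np.asarray(prior, dtype=float))
    log_post -= log_post.max()  # subtract the max for numerical stability
    weights = np.exp(log_post)
    return weights / weights.sum()


# Made-up log marginal likelihoods for three candidate models.
weights = bma_weights([-104.2, -101.7, -108.9])
predictions = np.array([1.8, 2.1, 1.2])  # made-up point predictions
print(weights, float(weights @ predictions))
```

The averaged prediction is the weight-weighted sum of the individual model predictions.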

Elec. Eng. and Systems Science

Image and Video Processing (eess.IV)
Comparison of the P300 detection accuracy related to the BCI speller and image recognition scenarios
Neurons and Cognition, Human-Computer Interaction, Machine Learning, Signal Processing. 4 authors. pdf
There are several protocols in the Electroencephalography (EEG) recording scenarios which produce various types of event-related potentials (ERP). The P300 pattern is a well-known ERP which is produced by auditory and visual oddball paradigms and the BCI speller system. …
In this study, P300 and non-P300 separability is investigated in two scenarios: an image recognition paradigm and a BCI speller. The image recognition scenario is an experiment that examines the participants’ knowledge of an image shown to them earlier by analyzing the EEG signal recorded while they observe that image as a visual stimulus. To do this, three well-known classifiers (SVM, Bayes LDA, and sparse logistic regression) were used to classify EEG recordings in a six-class problem. Filtered and down-sampled temporal samples of the EEG recordings were used as features for classifying the P300 pattern. Also, different sets of EEG recordings, with 4, 8, and 16 channels and different numbers of trials, were used to cover various situations in the comparison. Accuracy increased with the number of trials and channels. The results show that better accuracy is observed in the image recognition scenario across the different sets of channels and numbers of trials. So it can be concluded that the P300 pattern produced in the image recognition paradigm is more separable than that produced in the BCI (matrix speller) scenario.
Signal Processing (eess.SP)
Robustness of Brain Tumor Segmentation
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning. 3 authors. pdf
We address the generalization behavior of deep neural networks in the context of brain tumor segmentation. While current topologies show an increasingly complex structure, the overall benchmark performance improves only negligibly. …
In our experiments, we demonstrate that a well trained U-Net shows the best generalization behavior and is sufficient to solve this segmentation problem. We illustrate why extensions of this model can be not only pointless but even harmful in a realistic scenario. Also, we suggest two simple modifications (that do not alter the topology) to further improve its generalization performance.

Other

Astrophysics of Galaxies (astro-ph.GA)
Evolution of the accretion disk-corona during bright hard-to-soft state transition: A reflection spectroscopic study with GX 339-4
High Energy Physics - Experiment, High Energy Astrophysical Phenomena, Plasma Physics, General Relativity and Quantum Cosmology, Data Analysis, Statistics and Probability. 6 authors. pdf
We present the analysis of several observations of the black hole binary GX 339–4 during its bright intermediate states from two different outbursts (2002 and 2004), as observed by RXTE/PCA. We perform a consistent study of its reflection spectrum by employing the relxill family of relativistic reflection models to probe the evolutionary properties of the accretion disk including the inner disk radius (\(R_{\rm in}\)), ionization parameter (\(\xi\)), temperatures of the inner disk (\(T_{\rm in}\)), corona (\(kT_{\rm e}\)), and its optical depth (\(\tau\)). …
Our analysis indicates that the disk inner edge approaches the inner-most stable circular orbit (ISCO) during the early onset of bright hard state, and that the truncation radius of the disk remains low (\(< 9 R_{\rm g}\)) throughout the transition from hard to soft state. This suggests that the changes observed in the accretion disk properties during the state transition are driven by variation in accretion rate, and not necessarily due to changes in the inner disk’s radius. We compare the aforementioned disk properties in two different outbursts, with state transitions occurring at dissimilar luminosities, and find identical evolutionary trends in the disk properties, with differences only seen in corona’s \(kT_{\rm e}\) and \(\tau\). We also perform an analysis by employing a self-consistent Comptonized accretion disk model accounting for the scatter of disk photons by the corona, and measure low inner disk truncation radius across the bright intermediate states, using the temperature dependent values of spectral hardening factor, thereby independently confirming our results from the reflection spectrum analysis.
High Energy Astrophysical Phenomena (astro-ph.HE)
Evaporative cooling of icy interstellar grains. I
Computational Physics, Astrophysics of Galaxies, Space Physics. 2 authors. pdf
Context. While radiative cooling of interstellar grains is a well-known process, little detail is known about the cooling of grains with an icy mantle that contains volatile adsorbed molecules. …
Aims. We explore basic details for the cooling process of an icy grain with properties relevant to dark interstellar clouds. Methods. Grain cooling was described with a numerical code considering a grain with an icy mantle that is structured in monolayers and containing several volatile species in proportions consistent with interstellar ice. Evaporation was treated as first-order decay. Diffusion and subsequent thermal desorption of bulk-ice species was included. Temperature decrease from initial temperatures of 100, 90, 80, 70, 60, 50, 40, 30, and 20K was studied, and we also followed the composition of ice and evaporated matter. Results. We find that grain cooling occurs by partially successive and partially overlapping evaporation of different species. The most volatile molecules (N2) first evaporate at the greatest rate and are most rapidly depleted from the outer ice monolayers. The most important coolant is CO, but evaporation of more refractory species, such as CH4 and even CO2, is possible when the former volatiles are not available. Cooling of high-temperature grains takes longer because volatile molecules are depleted faster and the grain has to switch to slow radiative cooling at a higher temperature. For grain temperatures above 40K, most of the thermal energy is carried away by evaporation. Evaporation of the nonpolar volatile species induces a complete change of the ice surface, as the refractory polar molecules (H2O) are left behind. Conclusions. The effectiveness of thermal desorption from heated icy grains (e.g., the yield of cosmic-ray-induced desorption) is primarily controlled by the thermal energy content of the grain and the number and availability of volatile molecules.

Quantitative Finance

Risk Management (q-fin.RM)
Online Quantification of Input Model Uncertainty by Two-Layer Importance Sampling
Applications, Risk Management. 2 authors. pdf
Stochastic simulation has been widely used to analyze the performance of complex stochastic systems and facilitate decision making in those systems. Stochastic simulation is driven by the input model, which is a collection of probability distributions that model the stochasticity in the system. …
The input model is usually estimated using a finite amount of data, which introduces the so-called input model uncertainty (or, input uncertainty for short) to the simulation output. How to quantify input uncertainty has been studied extensively, and many methods have been proposed for the batch data setting, i.e., when all the data are available at once. However, methods for “streaming data” arriving sequentially in time are still in demand, even though streaming data have become increasingly prevalent in modern applications. To fill this gap, we propose a two-layer importance sampling framework that incorporates streaming data for online input uncertainty quantification. Under this framework, we develop two algorithms that suit two different application scenarios: the first is when data come at a fast speed and there is no time for any simulation in between updates; the second is when data come at a moderate speed and a few but limited simulations are allowed at each time stage. We show the consistency and asymptotic convergence rate results, which theoretically show the efficiency of our proposed approach. We further demonstrate the proposed algorithms on an example of the news vendor problem.
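As a rough illustration of why importance sampling suits the "no time for new simulations" scenario, the sketch below reuses simulation runs generated under an old input-model parameter and reweights them by a likelihood ratio after the parameter estimate changes. This is only the generic reweighting idea, not the paper's two-layer algorithm; the exponential input model, the rates, and the names `exp_pdf`, `simulate`, and `reweighted_mean` are all assumptions made for the example.

```python
import math
import random


def exp_pdf(x, rate):
    """Density of an exponential input model with the given rate."""
    return rate * math.exp(-rate * x)


def simulate(x):
    """Toy stand-in for a simulation run: the output is just the input draw."""
    return x


def reweighted_mean(inputs, outputs, old_rate, new_rate):
    """Self-normalized importance sampling estimate of the mean output
    under new_rate, using inputs that were drawn under old_rate."""
    weights = [exp_pdf(x, new_rate) / exp_pdf(x, old_rate) for x in inputs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, outputs)) / total


rng = random.Random(0)
old_rate, new_rate = 1.0, 1.2  # hypothetical estimates before and after new data

# Simulations are run once, under the old input model.
inputs = [rng.expovariate(old_rate) for _ in range(200_000)]
outputs = [simulate(x) for x in inputs]

# After the estimate is updated, the old runs are reweighted instead of rerun.
estimate = reweighted_mean(inputs, outputs, old_rate, new_rate)
print(estimate)  # should be close to the exact mean 1 / new_rate
```

In this toy setup the exact answer is known (the mean of an exponential with rate 1.2), so the reweighted estimate can be compared against it without running any new simulations.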
Statistical Finance (q-fin.ST)
The Dynamics of Financial Markets: Fibonacci numbers, Elliott waves, and solitons
Statistical Finance. 1 author. pdf
In this paper, an information-theoretic approach is applied to the description of financial markets. A model that is expected to describe market dynamics is presented. …
It is shown that market trend and cycle dynamics can be described from a unified viewpoint. The model predictions agree comparatively well with the Fibonacci ratios and numbers used for the analysis of market price and time projections. It proves possible to link time and price projections, thus increasing the accuracy of predicting, well in advance, the moment of trend termination. The model is tested against real data from the stock and financial markets.

Quantum Physics

Quantum Physics (quant-ph)
Device-independent Randomness Expansion with Entangled Photons
Data Analysis, Statistics and Probability, Quantum Physics. 15 authors. pdf
With the growing availability of experimental loophole-free Bell tests, it has become possible to implement a new class of device-independent random number generators whose output can be certified to be uniformly random without requiring a detailed model of the quantum devices used. However, all of these experiments require many input bits in order to certify a small number of output bits, and it is an outstanding challenge to develop a system that generates more randomness than is used. …
Here, we devise a device-independent spot-checking protocol which uses only uniform bits as input. Implemented with a photonic loophole-free Bell test, we can produce 24% more certified output bits (1,181,264,237) than consumed input bits (953,301,640), which is 5 orders of magnitude more efficient than our previous work [arXiv:1812.07786]. The experiment ran for 91.0 hours, creating randomness at an average rate of 3606 bits/s with a soundness error bounded by \(5.7\times 10^{-7}\) in the presence of classical side information. Our system will allow for greater trust in public sources of randomness, such as randomness beacons, and the protocols may one day enable high-quality sources of private randomness as the device footprint shrinks.
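The headline figures in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above (output bits, input bits, and run time); the variable names are ours.

```python
# Figures quoted in the abstract.
output_bits = 1_181_264_237  # certified output bits
input_bits = 953_301_640     # consumed input bits
run_hours = 91.0

# Fractional expansion: how many more bits came out than went in.
expansion = output_bits / input_bits - 1

# Average generation rate over the whole run, in bits per second.
rate = output_bits / (run_hours * 3600)

net_bits = output_bits - input_bits

print(f"expansion: {expansion:.1%}")
print(f"rate: {rate:.0f} bits/s")
print(f"net new bits: {net_bits:,}")
```

The expansion rounds to the stated 24%, and the rate rounds to the stated 3606 bits/s, which suggests the quoted rate is total output bits divided by total run time.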
Continuous-variable quantum cryptography with discrete alphabets: Composable security under collective Gaussian attacks
Data Analysis, Statistics and Probability, Quantum Physics. 2 authors. pdf
We consider continuous-variable quantum key distribution with discrete-alphabet encodings. In particular, we study protocols where information is encoded in the phase of displaced coherent (or thermal) states, even though the results can be directly extended to any protocol based on finite constellations of displaced Gaussian states. …
In this setting, we provide a composable security analysis in the finite-size regime assuming the realistic but restrictive hypothesis of collective Gaussian attacks. Under this assumption, we can efficiently estimate the parameters of the channel via maximum likelihood estimators and bound the corresponding error in the final secret key rate.

Mathematics

Statistics Theory (math.ST)
Universal Inference Using the Split Likelihood Ratio Test
Statistics Theory, Machine Learning, Statistics Theory, Methodology. 3 authors. pdf
We propose a general method for constructing hypothesis tests and confidence sets that have finite sample guarantees without regularity conditions. We refer to such procedures as “universal.” …
The method is very simple and is based on a modified version of the usual likelihood ratio statistic, that we call “the split likelihood ratio test” (split LRT). The method is especially appealing for irregular statistical models. Canonical examples include mixture models and models that arise in shape-constrained inference. Constructing tests and confidence sets for such models is notoriously difficult. Typical inference methods, like the likelihood ratio test, are not useful in these cases because they have intractable limiting distributions. In contrast, the method we suggest works for any parametric model and also for some nonparametric models. The split LRT can also be used with profile likelihoods to deal with nuisance parameters, and it can also be run sequentially to yield anytime-valid \(p\)-values and confidence sequences.
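The abstract does not spell out the statistic, so the following is a minimal sketch of the split LRT as we understand the construction: split the data into two parts \(D_0\) and \(D_1\), fit any estimate on \(D_1\), evaluate the likelihood ratio on \(D_0\) against the null-constrained fit, and reject when the ratio exceeds \(1/\alpha\). The unit-variance Gaussian model, the simple null \(\mu = \mu_0\), and the function names are assumptions chosen to keep the example short.

```python
import math
import random


def gaussian_loglik(data, mu):
    """Log-likelihood of the data under N(mu, 1)."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (x - mu) ** 2 for x in data)


def split_lrt_rejects(data, mu0=0.0, alpha=0.05, seed=0):
    """Split LRT for H0: mu = mu0 in a N(mu, 1) model (illustrative sketch)."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    d0, d1 = shuffled[:half], shuffled[half:]

    # Any estimator may be used on D1; here, the sample mean.
    mu_hat_1 = sum(d1) / len(d1)

    # The null is a single point, so the null-constrained fit on D0 is mu0 itself.
    log_ratio = gaussian_loglik(d0, mu_hat_1) - gaussian_loglik(d0, mu0)

    # Reject when the split likelihood ratio exceeds 1 / alpha.
    return log_ratio > math.log(1 / alpha)


null_like = [-1.0, 1.0] * 50  # centered at 0
shifted = [1.5, 2.5] * 50     # centered at 2

print(split_lrt_rejects(null_like))  # False
print(split_lrt_rejects(shifted))    # True
```

Comparing against the fixed threshold \(1/\alpha\), rather than a quantile of a limiting distribution, is what removes the need for regularity conditions; the cost is that half of the data is used only to choose the alternative.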

Physics

Computational Physics (physics.comp-ph)
Interfacial Atomic Number Contrast in Thick TEM Samples
Computational Physics, Instrumentation and Detectors. 2 authors. pdf
The atomic number contrast imaging technique reveals an increase in intensity at interfaces of a high and low-density material in case of relatively thick samples. Elastic scattering factors and absorption coefficients are incorporated in a probabilistic model to study atomic contrast occurring at the interface of two materials when the High-Angle Annular Dark-Field (HAADF) detector is used in the Scanning TEM (STEM) mode. …
Simulations of thick samples reveal that electrons traverse from a higher density material to a lower density material near the interface which increases the HAADF-STEM signal. This effect is more dominant in TEM samples of thickness greater than 100 nm and the increase in signal occurs up to 20 nm from the interface. The behavior of electrons near the interface is explained by comparing the simulation results with experimental TEM micrographs in the HAADF-STEM mode.

Quantitative Biology

Quantitative Methods (q-bio.QM)
TF3P: Three-dimensional Force Fields Fingerprint Learned by Deep Capsular Network
Biomolecules, Machine Learning, Quantitative Methods. 8 authors. pdf
Molecular fingerprints are the workhorse in ligand-based drug discovery. In recent years, an increasing number of research papers have reported fascinating results on using deep neural networks to learn 2D molecular representations as fingerprints. …
One may anticipate that the integration of deep learning would also contribute to the prosperity of 3D fingerprints. Here, we present a new 3D small molecule fingerprint, the three-dimensional force fields fingerprint (TF3P), learned by a deep capsular network whose training does not require a labeled dataset for specific predictive tasks. TF3P can encode the 3D force field information of molecules and demonstrates a stronger ability to capture 3D structural changes, recognize molecules that are alike in 3D but not in 2D, and recognize similar targets inaccessible to other fingerprints, including the only existing 3D fingerprint, E3FP, based only on ligand similarity. Furthermore, TF3P is compatible with both statistical models (e.g. similarity ensemble approach) and machine learning models. Altogether, we report TF3P as a new 3D small molecule fingerprint with a promising future in ligand-based drug discovery.