Research


Back

Efficient Detection of Distributed Constraint Violations

Abstract: In many distributed environments, the primary function of monitoring software is to detect anomalies, i.e., instances when system behavior deviates substantially from the norm. In this paper, we propose communication-efficient schemes for the anomaly detection problem, which we model as one of detecting the violation of global constraints defined over distributed system variables. Our approach eliminates the need to continuously track the global system state by decomposing global constraints into local constraints that can be checked efficiently at each site. Only in the occasional event that a local constraint is violated, do we resort to more expensive global constraint checking. We show that the problem of selecting the local constraints, based on frequency distribution of individual system variables, so as to minimize the communication cost is NP-hard. We propose approximation algorithms for computing provably near-optimal (in terms of the number of messages) local constraints. Experimental results with real-life network traffic data sets demonstrate that our technique can reduce message communication overhead by as much as 70% compared to existing data distribution-agnostic approaches.

  • Shipra Agrawal, Supratim Deb, K.V. M. Naidu, Rajeev Rastogi, "Efficient Detection of Distributed Constraint Violations", short paper, ICDE 2007.

    Anomaly Detection and Diagnosis using Passive Monitoring

    Abstract: In this work, we propose techniques for building a low-cost passive monitoring infrastructure for detection and diagnosis of link-level anomalies from path level measurements. Specifically, we attempt to compute appropriate locations of passive monitoring devices in a network and the paths to be monitored by these devices such that the monitoring infrastructure cost and data communication cost, in terms of number of monitored paths, is minimal. For general mesh topologies, we propose greedy algorithms, which can achieve logarithmic approximation factors, for monitor placement and path selection. We show that for the problems of monitor placement and path selection for anomaly detection in general graph topologies, getting better approximation guarantees are hard. We also consider tree topologies typical of enterprise networks, and show that while similar NP-hardness results hold, constant-factor approximation algorithms are possible for such topologies.

  • Shipra Agrawal, K. V. M. Naidu, Rajeev Rastogi, "Diagnosing Link-level Anomalies Using Passive Probes", INFOCOM 2007.

    VoIP service quality monitoring using active and passive probes

    Abstract: Service providers and enterprises all over the world are rapidly deploying Voice over IP (VoIP) networks because of reduced capital and operational expenditure, and easy creation of new services. Voice traffic has stringement requirements on the quality of service, like strict delay and loss requirements, and 99.999% network availability. However, IP networks have not been designed to easily meet the above requirements. Thus, service providers need service quality management tools that can proactively detect and mitigate service quality degradation of VoIP traffic. In this paper, we present active and passive probes that enable service providers to detect service impairments. We use the probes to compute the network parameters (delay, loss and jitter) that can be used to compute the call quality as a Mean Opinion Score using a voice quality metric, E-model. These tools can be used by service providers and enterprises to identify network impairments that cause service quality degradation and take corrective measures in real time so that the impact on the degradation perceived by endusers is minimal.

  • Shipra Agrawal, P. P. S. Narayan, Jeyashankher Ramamirtham, Rajeev Rastogi, Mark Smith, Ken Swanson, Marina Thottan, "VoIP service quality monitoring using active and passive probes", (invited paper) COMSWARE 2006.
  • Shipra Agrawal, C. N. Kanthi, K. V. M. Naidu, Jeyashankher Ramamirtham, Rajeev Rastogi, Scott Satkin, Anand Srinivasan: Monitoring infrastructure for converged networks and services. Bell Labs Technical Journal 12(2): 63-77 (2007)

    A Framework for High-Accuracy Privacy-Preserving Mining

    Abstract: To preserve client privacy in the data mining process, a variety of techniques based on random perturbation of individual data records have been proposed recently. In this paper, we present FRAPP, a generalized matrix-theoretic framework of random perturbation, which facilitates a systematic approach to the design of perturbation mechanisms for privacy-preserving mining. Specifically, FRAPP is used to demonstrate that (a) the prior techniques differ only in their choices for the perturbation matrix elements, and (b) a symmetric perturbation matrix with minimal condition number can be identified, maximizing the accuracy even under strict privacy guarantees. We also propose a novel perturbation mechanism wherein the matrix elements are themselves characterized as random variables, and demonstrate that this feature provides significant improvements in privacy at only a marginal cost in accuracy.

  • S. Agrawal, J. R. Haritsa, "A Framework for High-Accuracy Privacy-Preserving Mining", IEEE International Conference on Data Engineering (ICDE 2005) April 5-8 , 2005, National Center of Sciences, Tokyo, Japan. (ICDE 2005 presentation)

    On Efficiency concerns in Privacy Preserving Data Mining

    Abstract: Data mining services require accurate input data for their results to be meaningful, but privacy concerns may influence users to provide spurious information. To encourage users to provide correct inputs, we recently proposed a data distortion scheme for association rule mining that simultaneously provides both privacy to the user and accuracy in the mining results. However, mining the distorted database can be orders of magnitude more time-consuming as compared to mining the original database. In this paper, we address this issue and demonstrate that by (a) generalizing the distortion process to perform symbol-specific distortion, (b) appropriately chooosing the distortion parameters, and (c) applying a variety of optimizations in the reconstruction process, runtime efficiencies that are well within an order of magnitude of undistorted mining can be achieved.

  • S. Agrawal, V. Krishnan, J. R. Haritsa, "On Efficiency concerns in Privacy Preserving Data Mining", 9th International Conference on Database Systems For Advanced Applications (DASFAA) 2004, Jeju Island, Korea.