References of "Wang, Jun 50003303"
Full Text
Privacy-preserving Recommender Systems Facilitated By The Machine Learning Approach
Wang, Jun UL

Doctoral thesis (2018)

Recommender systems, which play a critical role in e-business services, are closely linked to our daily lives. For example, companies such as YouTube and Amazon try to secure their profits by estimating personalized user preferences and recommending the most relevant items (e.g., products, news, etc.) to each user from a large number of candidates. State-of-the-art recommender systems are often built on top of collaborative filtering techniques, whose accuracy relies on precisely modeling user-item interactions by analyzing massive amounts of historical user data, such as browsing history, purchasing records, locations and so on. Generally, more data leads to more accurate estimates and more commercial strategies; as such, service providers have an incentive to collect and use more user data. On the one hand, recommender systems bring more income to service providers and more convenience to users; on the other hand, user data can be abused, giving rise to immediate privacy risks for the public. Therefore, how to preserve privacy while enjoying recommendation services is becoming an increasingly important topic for both the research community and commercial practitioners. Privacy concerns can differ when constructing recommender systems or providing recommendation services under different scenarios. In one scenario, a service provider wishes to protect the privacy of its data against inference attacks, which aim to infer more information about a database (e.g., whether a record is in it or not) by analyzing statistical outputs; in another scenario, multiple users agree to jointly perform a recommendation task, but none of them is willing to share their private data with any other user. Security primitives, such as homomorphic encryption, secure multiparty computation, and differential privacy, are immediate candidates to address these privacy concerns.
A typical approach to building efficient and accurate privacy-preserving solutions is to improve the security primitives and then apply them to existing recommendation algorithms. However, this approach often yields solutions that are far from satisfactory in practice, as most users have a low tolerance for increased latency or reduced accuracy in recommendation services. The PhD program explores machine-learning-aided approaches to building efficient privacy-preserving solutions for recommender systems. The results of each proposed solution demonstrate that machine learning can be a strong aid to privacy preservation, rather than only a troublemaker.

Peer Reviewed
Facilitating Privacy-preserving Recommendation-as-a-Service with Machine Learning
Wang, Jun UL; Delerue Arriaga, Afonso UL; Tang, Qiang et al

Poster (2018, October)

Machine-Learning-as-a-Service has become increasingly popular, with Recommendation-as-a-Service as one of the representative examples. In such services, providing privacy protection for users is an important topic. In the privacy-preserving solutions proposed over the past decade, privacy and machine learning are often seen as two competing goals. Although improving cryptographic primitives (e.g., secure multi-party computation (SMC) or homomorphic encryption (HE)) and devising sophisticated secure protocols have achieved remarkable progress, combining them with state-of-the-art recommender systems often yields far-from-practical solutions. We tackle this problem from the direction of machine learning. We aim to design crypto-friendly recommendation algorithms, and thus obtain efficient solutions by directly using existing cryptographic tools. In particular, we propose an HE-friendly recommender system, referred to as CryptoRec, which (1) decouples user features from the latent feature space, avoiding training the recommendation model on encrypted data; and (2) relies only on addition and multiplication operations, making the model straightforwardly compatible with HE schemes. These properties turn the recommendation computation into a simple matrix-multiplication operation. To further improve efficiency, we introduce a sparse-quantization-reuse method which reduces the recommendation-computation time by 9× (compared to using CryptoRec directly), without compromising accuracy. We demonstrate the efficiency and accuracy of CryptoRec on three real-world datasets. CryptoRec allows a server to estimate a user's preferences on thousands of items within a few seconds on a single PC, with the user's data homomorphically encrypted, while its prediction accuracy is still competitive with state-of-the-art recommender systems computing over clear data. Our solution enables Recommendation-as-a-Service on large datasets at a nearly real-time (seconds) level.
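
As a rough illustration of why a model restricted to additions and multiplications reduces to a matrix product, the following sketch scores items from a rating vector in the clear. It is not CryptoRec itself and performs no encryption; the function name `predict_scores`, the weight matrix, and the toy values are all hypothetical. The point is only that every operation used is one an HE scheme could in principle evaluate on ciphertexts.

```python
def predict_scores(user_ratings, item_weights):
    """Score each item using only additions and multiplications.

    user_ratings: list of numbers, one per rated item (hypothetical input).
    item_weights: matrix (list of rows) with one row per rated item and
                  one column per candidate item (hypothetical model).
    """
    scores = []
    # zip(*item_weights) iterates over the columns of the matrix.
    for column in zip(*item_weights):
        total = 0
        for rating, weight in zip(user_ratings, column):
            total = total + rating * weight  # no comparisons, no divisions
        scores.append(total)
    return scores


# Toy data: three rated items, two candidate items.
user_ratings = [5, 0, 3]
item_weights = [
    [1, 0],
    [0, 2],
    [2, 1],
]
print(predict_scores(user_ratings, item_weights))
```

In an encrypted setting the ratings would be ciphertexts and the sums and products would be homomorphic operations; that part is deliberately left out of this sketch.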

Full Text
Peer Reviewed
Differentially Private Neighborhood-based Recommender Systems
Wang, Jun UL; Tang, Qiang

in IFIP Information Security & Privacy Conference (2017, May)

Privacy issues of recommender systems have become a hot topic in society as such systems are appearing in every corner of our lives. Although many secure multi-party computation protocols have been proposed to prevent information leakage in the process of recommendation computation, very little has been done to restrict the information leakage from the recommendation results. In this paper, we apply the differential privacy concept to neighborhood-based recommendation methods (NBMs) under a probabilistic framework. We first present a solution that, by directly calibrating Laplace noise into the training process, finds the maximum a posteriori similarity parameters in a differentially private manner. Then we connect differential privacy to NBMs by exploiting a recent observation that sampling from the scaled posterior distribution of a Bayesian model results in provably differentially private systems. Our experiments show that both solutions allow promising accuracy with a modest privacy budget, and the second solution yields better accuracy if the sampling asymptotically converges. We also compare our solutions to recent differentially private matrix factorization (MF) recommender systems, and show that our solutions achieve better accuracy when the privacy budget is reasonably small. This is an interesting result because MF systems often offer better accuracy when differential privacy is not applied.
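
For readers unfamiliar with Laplace noise calibration, here is a minimal sketch of the generic Laplace mechanism, in which the noise scale is the query's sensitivity divided by the privacy budget epsilon. This is the textbook mechanism applied to a single number, not the paper's training-time calibration; the function names and the example values are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution.

    The difference of two independent exponential samples with rate
    1/scale follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Return value plus Laplace noise of scale sensitivity / epsilon."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical example: release a similarity score of 0.8 whose
# sensitivity is assumed to be 1.0, under a privacy budget of 0.5.
rng = random.Random(42)
print(laplace_mechanism(0.8, sensitivity=1.0, epsilon=0.5, rng=rng))
```

A smaller epsilon gives a larger noise scale, which is the accuracy versus privacy-budget trade-off the abstract refers to.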

Full Text
Peer Reviewed
A Probabilistic View of Neighborhood-based Recommendation Methods
Wang, Jun UL; Tang, Qiang

in ICDM 2016 - IEEE International Conference on Data Mining series (ICDM) workshop CLOUDMINE (2016, December 12)

The probabilistic graphical model is an elegant framework for compactly representing complex real-world observations by modeling uncertainty and logical flow (conditionally independent factors). In this paper, we present a probabilistic framework of neighborhood-based recommendation methods (PNBM) in which similarity is regarded as an unobserved factor. Thus, PNBM reduces the estimation of user preferences to maximizing a posterior over similarity. We further introduce a novel multi-layer similarity descriptor which models and learns the joint influence of various features under PNBM, and name the new framework MPNBM. Empirical results on real-world datasets show that MPNBM allows very accurate estimation of user preferences.

Full Text
Peer Reviewed
Privacy-preserving Friendship-based Recommender Systems
Tang, Qiang; Wang, Jun UL

in IEEE Transactions on Dependable and Secure Computing (2016, November)

Privacy-preserving recommender systems have been an active research topic for many years. However, it is still a challenge to design an efficient solution without involving a fully trusted third party or multiple semi-trusted third parties. The key obstacle is the large underlying user population (i.e. huge input size) in such systems. In this paper, we revisit the concept of friendship-based recommender systems, proposed by Jeckmans et al. and Tang and Wang. These solutions are very promising because recommendations are computed based on inputs from a very small subset of the overall user population (precisely, a user’s friends and some randomly chosen strangers). We first clarify the single prediction protocol and Top-n protocol by Tang and Wang, correcting some flaws and improving the efficiency of the single prediction protocol. We then design a decentralized single prediction protocol by getting rid of the semi-honest service provider. In order to validate the designed protocols, we crawl Twitter and construct two datasets (FMT and 10-FMT) which are equipped with auxiliary friendship information. Based on 10-FMT and the MovieLens 100k dataset with simulated friendships, we show that even if our protocols use a very small subset of the datasets, their accuracy can still be equal to or better than that of some baseline algorithms. Based on these datasets, we further demonstrate that the outputs of our protocols leak a very small amount of information about the inputs, and that the leakage decreases as the input size increases. We finally show that the single prediction protocol is quite efficient but the Top-n protocol is not. However, we observe that the efficiency of the Top-n protocol can be dramatically improved if we slightly relax the desired security guarantee.

Full Text
Peer Reviewed
Recommender Systems and their Security Concerns
Wang, Jun UL; Tang, Qiang

Scientific Conference (2015, October)

Instead of simply using two-dimensional User × Item features, advanced recommender systems rely on additional dimensions (e.g. time, location, social network) in order to provide better recommendation services. In the first part of this paper, we survey a variety of dimension features and show how they are integrated into the recommendation process. As service providers collect more and more personal information, this raises great privacy concerns among the public. On the other hand, service providers can also suffer from attacks launched by malicious users who want to bias the recommendations. In the second part of this paper, we survey attacks from and against recommender service providers, as well as existing solutions.

Full Text
Peer Reviewed
Privacy-Preserving Context-Aware Recommender Systems: Analysis and New Solutions
Tang, Qiang UL; Wang, Jun UL

in Computer Security - ESORICS 2015 - 20th European Symposium on Research in Computer Security (2015, September)

Nowadays, recommender systems have become an indispensable part of our daily life and provide personalized services for almost everything. However, nothing is free: such systems have also raised severe privacy concerns in society because they accumulate a lot of personal information in order to provide recommendations. In this work, we construct privacy-preserving recommendation protocols by incorporating cryptographic techniques and the inherent data characteristics in recommender systems. We first revisit the protocols by Jeckmans et al. and show a number of security issues. Then, we propose two privacy-preserving protocols, which compute predicted ratings for a user based on inputs from both the user’s friends and a set of randomly chosen strangers. A user has the flexibility to retrieve either a predicted rating for an unrated item or the Top-N unrated items. The proposed protocols prevent information leakage from both protocol executions and the protocol outputs. Finally, we use the well-known MovieLens 100k dataset to evaluate performance for different parameter sizes.

Full Text
Peer Reviewed
Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse.
Orlando, Ludovic; Ginolhac, Aurélien UL; Zhang, Guojie et al

in Nature (2013), 499(7456), 74-8

The rich fossil record of equids has made them a model for evolutionary processes. Here we present a 1.12-times coverage draft genome from a horse bone recovered from permafrost dated to approximately 560-780 thousand years before present (kyr BP). Our data represent the oldest full genome sequence determined so far by almost an order of magnitude. For comparison, we sequenced the genome of a Late Pleistocene horse (43 kyr BP), and modern genomes of five domestic horse breeds (Equus ferus caballus), a Przewalski's horse (E. f. przewalskii) and a donkey (E. asinus). Our analyses suggest that the Equus lineage giving rise to all contemporary horses, zebras and donkeys originated 4.0-4.5 million years before present (Myr BP), twice the conventionally accepted time to the most recent common ancestor of the genus Equus. We also find that horse population size fluctuated multiple times over the past 2 Myr, particularly during periods of severe climatic changes. We estimate that the Przewalski's and domestic horse populations diverged 38-72 kyr BP, and find no evidence of recent admixture between the domestic horse breeds and the Przewalski's horse investigated. This supports the contention that Przewalski's horses represent the last surviving wild horse population. We find similar levels of genetic variation among Przewalski's and domestic populations, indicating that the former are genetically viable and worthy of conservation efforts. We also find evidence for continuous selection on the immune system and olfaction throughout horse evolution. Finally, we identify 29 genomic regions among horse breeds that deviate from neutrality and show low levels of genetic variation compared to the Przewalski's horse. Such regions could correspond to loci selected early during domestication.
