Achieving k-Anonymity by clustering in attribute hierarchical structures


Li, Jiuyong, Wong, Raymond Chi-Wing, Fu, Ada Wai-Chee and Pei, Jian. 2006. "Achieving k-Anonymity by clustering in attribute hierarchical structures." Tjoa, A. Min and Trujillo, Juan (ed.) 8th International Conference on Data Warehousing and Knowledge Discovery. Krakow, Poland 04 - 08 Sep 2006 Germany. Springer. https://doi.org/10.1007/11823728
Presentation Type: Paper
Authors: Li, Jiuyong (Author), Wong, Raymond Chi-Wing (Author), Fu, Ada Wai-Chee (Author) and Pei, Jian (Author)
Editors: Tjoa, A. Min and Trujillo, Juan
Journal or Proceedings Title: Lecture Notes in Computer Science (Book series)
Journal Citation: 4081, pp. 405-416
Number of Pages: 12
Year: 2006
Publisher: Springer
Place of Publication: Germany
ISSN: 1611-3349, 0302-9743
ISBN: 9783540377368
Digital Object Identifier (DOI): https://doi.org/10.1007/11823728
Web Address (URL) of Paper: https://link.springer.com/chapter/10.1007/11823728_39
Conference/Event: 8th International Conference on Data Warehousing and Knowledge Discovery
Event Date: 04 to 08 Sep 2006
Event Location: Krakow, Poland
Abstract

Individual privacy will be at risk if a published data set is not properly de-identified. k-anonymity is a major technique to de-identify a data set. A more general view of k-anonymity is clustering with a constraint on the minimum number of objects in every cluster. Most existing approaches to achieving k-anonymity by clustering are for numerical (or ordinal) attributes. In this paper, we study achieving k-anonymity by clustering in attribute hierarchical structures. We define generalisation distances between tuples to characterise distortions by generalisations and discuss the properties of the distances. We conclude that the generalisation distance is a metric distance. We propose an efficient clustering-based algorithm for k-anonymisation. We experimentally show that the proposed method is more scalable and causes significantly less distortion than an optimal global recoding k-anonymity method.
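
To make the ideas in the abstract concrete, here is a minimal Python sketch. It is not the paper's definition or algorithm: the ZIP-code hierarchy `PARENT`, the constant `HEIGHT`, and the function `illustrative_distance` are assumptions chosen for illustration only. The sketch shows the k-anonymity condition (every combination of quasi-identifier values occurs at least k times) and one plausible way a distance could be read off an attribute hierarchy, namely the number of generalisation steps needed before two values coincide.

```python
from collections import Counter

# Hypothetical generalisation hierarchy for one attribute, stored as child -> parent.
PARENT = {
    "4350": "435*", "4351": "435*",
    "4360": "436*",
    "435*": "43**", "436*": "43**",
    "43**": "*",
}
HEIGHT = 3  # generalisation steps from a leaf to the root "*" (assumed uniform)


def ancestors(value):
    """Return [value, parent, grandparent, ..., root]."""
    chain = [value]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def closest_common_generalisation(a, b):
    """Return the lowest hierarchy node covering both values, and its level above a."""
    chain_b = set(ancestors(b))
    for level, node in enumerate(ancestors(a)):
        if node in chain_b:
            return node, level
    raise ValueError("values do not share a hierarchy")


def illustrative_distance(a, b):
    """Steps needed to generalise two leaf values to a common node, scaled to [0, 1]."""
    _, level = closest_common_generalisation(a, b)
    return level / HEIGHT


def generalise_cluster(values):
    """Replace every value in a cluster by the cluster's closest common generalisation."""
    node = values[0]
    for v in values[1:]:
        node, _ = closest_common_generalisation(node, v)
    return [node] * len(values)


def is_k_anonymous(rows, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    counts = Counter(tuple(r) for r in rows)
    return all(c >= k for c in counts.values())
```

Under these assumptions, a cluster of at least k tuples becomes indistinguishable on the quasi-identifier once `generalise_cluster` is applied, and clusters whose members are close under the distance need fewer generalisation steps, which is the intuition behind clustering with a minimum cluster size.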

Keywords: data mining; privacy preserving; k-anonymity
ANZSRC Field of Research 2020: 461305. Data structures and algorithms; 460908. Information systems organisation and management
Public Notes

File reproduced in accordance with the copyright policy of the publisher/author.

Byline Affiliations:
Department of Mathematics and Computing
Chinese University of Hong Kong, China
Simon Fraser University, Canada
Permalink: https://research.usq.edu.au/item/9y0wz/achieving-k-anonymity-by-clustering-in-attribute-hierarchical-structures


Related outputs

Affective and sentimental computing
Xie, Haoran, Wong, Tak‑Lam, Wang, Fu Lee, Wong, Raymond, Tao, Xiaohui and Wang, Ran. 2019. "Affective and sentimental computing." International Journal of Machine Learning and Cybernetics. 10 (8), pp. 2043-2044. https://doi.org/10.1007/s13042-019-00977-8
Comparing decision tree and optimal risk pattern mining for analysing Emergency Ultra Short Stay Unit data
Petrus, Khaleel, Li, Jiuyong and Fahey, Paul. 2008. "Comparing decision tree and optimal risk pattern mining for analysing Emergency Ultra Short Stay Unit data." ICMLC 2008: 7th International Conference on Machine Learning and Cybernetics. Kunming, China 12 - 15 Jul 2008 https://doi.org/10.1109/ICMLC.2008.4620410
Satisfying privacy requirements before data anonymization
Sun, Xiaoxun, Wang, Hua, Li, Jiuyong and Zhang, Yanchun. 2012. "Satisfying privacy requirements before data anonymization." The Computer Journal. 55 (4), pp. 422-437. https://doi.org/10.1093/comjnl/bxr028
An approximate microaggregation approach for microdata protection
Sun, Xiaoxun, Wang, Hua, Li, Jiuyong and Zhang, Yanchun. 2012. "An approximate microaggregation approach for microdata protection." Expert Systems with Applications. 39 (2), pp. 2211-2219. https://doi.org/10.1016/j.eswa.2011.04.223
Effective pruning for the discovery of conditional functional dependencies
Li, Jiuyong, Liu, Jixue, Toivonen, Hannu and Yong, Jianming. 2013. "Effective pruning for the discovery of conditional functional dependencies." The Computer Journal. 56 (3), pp. 378-392. https://doi.org/10.1093/comjnl/bxs082
Data privacy against composition attack
Baig, Muzammil M., Li, Jiuyong, Liu, Jixue, Ding, Xiaofeng and Wang, Hua. 2012. "Data privacy against composition attack." Lee, Sang-Goo, Peng, Zhiyong, Zhou, Xiaofang, Moon, Yang-Sae, Unland, Rainer and Yoo, Jaesoo (ed.) 17th International Conference on Database Systems for Advanced Applications (DASFAA 2012). Busan, South Korea 15 - 18 Apr 2012 Berlin, Germany. https://doi.org/10.1007/978-3-642-29038-1_24
Cloning for privacy protection in multiple independent data publications
Baig, Muzammil M., Li, Jiuyong, Liu, Jixue and Wang, Hua. 2011. "Cloning for privacy protection in multiple independent data publications." Berendt, Bettina, de Vries, Arjen, Fan, Wenfei and Macdonald, Craig (ed.) CIKM 2011: 20th ACM Conference on Information and Knowledge Management. Glasgow, United Kingdom 24 - 28 Oct 2011 New York, NY, USA. https://doi.org/10.1145/2063576.2063705
Publishing anonymous survey rating data
Sun, Xiaoxun, Wang, Hua, Li, Jiuyong and Pei, Jian. 2011. "Publishing anonymous survey rating data." Data Mining and Knowledge Discovery. 23 (3), pp. 379-406. https://doi.org/10.1007/s10618-010-0208-4
Injecting purpose and trust into data anonymisation
Sun, Xiaoxun, Wang, Hua, Li, Jiuyong and Zhang, Yanchun. 2011. "Injecting purpose and trust into data anonymisation." Computers and Security. 30 (5), pp. 332-345. https://doi.org/10.1016/j.cose.2011.05.005
Achieving p-sensitive k-anonymity via anatomy
Sun, Xiaoxun, Wang, Hua, Li, Jiuyong and Ross, David. 2009. "Achieving p-sensitive k-anonymity via anatomy." ICEBE 2009: IEEE International Conference on e-Business Engineering. Macau, China 21 - 23 Oct 2009 United States. https://doi.org/10.1109/ICEBE.2009.34
L-diversity based dynamic update for large time-evolving microdata
Sun, Xiaoxun, Wang, Hua and Li, Jiuyong. 2008. "L-diversity based dynamic update for large time-evolving microdata." Wobcke, Wayne and Zhang, Mengjie (ed.) AI 2008: 21st Australasian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence. Auckland, New Zealand 01 - 05 Dec 2008 Germany. Springer. https://doi.org/10.1007/978-3-540-89378-3_47
Authorization approaches for advanced permission-role assignments
Wang, Hua, Yong, Jianming, Li, Jiuyong and Peng, Min. 2008. "Authorization approaches for advanced permission-role assignments." Shen, Weiming, Zheng, Qinghua, Luo, Junzhou, Yong, Jianming, Duan, Zhenhua and Tian, Feng (ed.) CSCWD 2008: 12th International Conference on Computer Supported Cooperative Work in Design. Xi'an, China 16 - 18 Apr 2008 United States. https://doi.org/10.1109/CSCWD.2008.4536994
Portable devices of security and privacy preservation for e-learning
Yong, Jianming, Li, Jiuyong and Wang, Hua. 2008. "Portable devices of security and privacy preservation for e-learning." Shen, Weiming, Zheng, Qinghua, Luo, Junzhou, Yong, Jianming, Duan, Zhenhua and Tian, Feng (ed.) CSCWD 2008: 12th International Conference on Computer Supported Cooperative Work in Design. Xi'an, China 16 - 18 Apr 2008 China. https://doi.org/10.1109/CSCWD.2008.4537121
Robustness analysis of diversified ensemble decision tree algorithms for microarray data classification
Hu, Hong, Li, Jiuyong, Wang, Hua, Daggard, Grant and Wang, Li-Zhen. 2008. "Robustness analysis of diversified ensemble decision tree algorithms for microarray data classification." ICMLC 2008: 7th International Conference on Machine Learning and Cybernetics. Kunming, China 12 - 15 Jul 2008 United States. https://doi.org/10.1109/ICMLC.2008.4620389
(p+, α)-sensitive k-anonymity: a new enhanced privacy protection model
Sun, Xiaoxun, Wang, Hua, Truta, Traian Marius, Li, Jiuyong and Li, Ping. 2008. "(p+, α)-sensitive k-anonymity: a new enhanced privacy protection model." Wu, Qiang (ed.) 8th IEEE International Conference on Computer and Information Technology. Sydney, Australia 08 - 11 Jul 2008 United States. https://doi.org/10.1109/CIT.2008.4594650
On the complexity of restricted k-anonymity problem
Sun, Xiaoxun, Wang, Hua and Li, Jiuyong. 2008. "On the complexity of restricted k-anonymity problem." Yanchun, Zhang (ed.) 10th Asia-Pacific Web Conference (APWeb 2008). Shenyang, China 26 - 28 Apr 2008 Germany. Springer. https://doi.org/10.1007/978-3-540-78849-2_30
A fast algorithm for finding correlation clusters in noise data
Li, Jiuyong, Huang, Xiaodi, Selke, Clinton and Yong, Jianming. 2007. "A fast algorithm for finding correlation clusters in noise data." Zhou, Zhi-Hua, Li, Hang and Yang, Qiang (ed.) 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2007). Nanjing, China 22 - 25 May 2007 Germany. Springer. https://doi.org/10.1007/978-3-540-71701-0_68
Classification using multiple and negative target rules
Li, Jiuyong and Jones, Jason. 2006. "Classification using multiple and negative target rules." Gabrys, Bogdan, Howlett, Robert J. and Jain, Lakhmi C. (ed.) 10th International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES 2006). Bournemouth, United Kingdom 09 - 11 Oct 2006 Germany. https://doi.org/10.1007/11892960_26
Combined gene selection methods for microarray data analysis
Hu, Hong, Li, Jiuyong, Wang, Hua and Daggard, Grant. 2006. "Combined gene selection methods for microarray data analysis." Gabrys, Bogdan, Howlett, Robert J. and Jain, Lakhmi C. (ed.) 10th International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES 2006). Bournemouth, United Kingdom 09 - 11 Oct 2006 Germany. Springer. https://doi.org/10.1007/11892960_117
Robust rule-based prediction
Li, Jiuyong. 2006. "Robust rule-based prediction." IEEE Transactions on Knowledge and Data Engineering. 18 (8), pp. 1043-1054. https://doi.org/10.1109/TKDE.2006.129
On optimal rule discovery
Li, Jiuyong. 2006. "On optimal rule discovery." IEEE Transactions on Knowledge and Data Engineering. 18 (4), pp. 460-471. https://doi.org/10.1109/TKDE.2006.1599385
Finding similar patterns in microarray data
Chen, Xiangsheng, Li, Jiuyong, Daggard, Grant and Huang, Xiaodi. 2005. "Finding similar patterns in microarray data." Zhang, Shichao and Jarvis, Ray (ed.) AI 2005: Advances in artificial intelligence. Berlin, Germany. Springer. pp. 1272-1276
A framework for role-based group delegation in distributed environments
Wang, Hua, Li, Jiuyong, Addie, Ron, Dekeyser, Stijn and Watson, Richard. 2006. "A framework for role-based group delegation in distributed environments." Estivill-Castro, Vladimir and Dobbie, Gillian (ed.) 29th Australasian Computer Science Conference (ACSC 2006). Hobart, Australia 16 - 19 Jan 2006 Australia.
Efficient discovery of risk patterns in medical data
Li, Jiuyong, Fu, Ada Wai-chee and Fahey, Paul. 2009. "Efficient discovery of risk patterns in medical data." Artificial Intelligence in Medicine. 45 (1), pp. 77-89. https://doi.org/10.1016/j.artmed.2008.07.008
A robust ensemble classification method analysis
Zhang, Zhongwei, Li, Jiuyong, Hu, Hong and Zhou, Hong. 2010. "A robust ensemble classification method analysis." Arabnia, Hamid R. (ed.) Advances in computational biology. New York, NY. United States. Springer. pp. 149-155
Validating privacy requirements in large survey rating data
Sun, Xiaoxun, Wang, Hua and Li, Jiuyong. 2011. "Validating privacy requirements in large survey rating data." Bessis, Nik and Xhafa, Fatos (ed.) Next generation data technologies for collective computational intelligence. Berlin, Germany. Springer. pp. 445-469
Satisfying privacy requirements: one step before anonymization
Sun, Xiaoxun, Wang, Hua and Li, Jiuyong. 2010. "Satisfying privacy requirements: one step before anonymization." Zaki, Mohammed Javeed, Yu, Jeffrey Xu, Ravindran, B. and Pudi, Vikram (ed.) 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2010). Hyderabad, India 21 - 24 Jun 2010 Berlin, Germany. Springer. https://doi.org/10.1007/978-3-642-13657-3_21
Privacy protection for genomic data: current techniques and challenges
Baig, Muzammil M., Li, Jiuyong, Liu, Jixue, Wang, Hua and Wang, Junhu. 2010. "Privacy protection for genomic data: current techniques and challenges." Ras, Zbigniew W. and Tsay, Li-Shang (ed.) Advances in intelligent information systems. Berlin, Germany. Springer. pp. 175-193
On the effectiveness of gene selection for microarray classification methods
Zhang, Zhongwei, Li, Jiuyong, Hu, Hong and Zhou, Hong. 2010. "On the effectiveness of gene selection for microarray classification methods." Nguyen, Ngoc Thanh, Le, Manh Thanh and Swiatek, Jerzy (ed.) 2nd Asian Conference on Intelligent Information and Database Systems (ACIIDS 2010). Hue City, Vietnam 24 - 26 Mar 2010 Heidelberg, Germany. https://doi.org/10.1007/978-3-642-12101-2_31
Microdata protection through approximate microaggregation
Sun, Xiaoxun, Wang, Hua and Li, Jiuyong. 2009. "Microdata protection through approximate microaggregation." Mans, Bernard (ed.) 32nd Australasian Computer Science Conference (ACSC 2009). Wellington, New Zealand 19 - 23 Jan 2009 Adelaide, Australia.
Injecting purpose and trust into data anonymisation
Sun, Xiaoxun, Wang, Hua and Li, Jiuyong. 2009. "Injecting purpose and trust into data anonymisation." Cheung, David, Song, Il-Yeol, Chu, Wesley, Hu, Xiaohua and Lin, Jimmy (ed.) 18th ACM International Conference on Information and Knowledge Management (CIKM 2009). Hong Kong, China 02 - 06 Nov 2009 New York, USA. https://doi.org/10.1145/1645953.1646166
An integrated model for next page access prediction
Khalil, Faten, Li, Jiuyong and Wang, Hua. 2009. "An integrated model for next page access prediction." International Journal of Knowledge and Web Intelligence. 1 (1/2), pp. 48-80. https://doi.org/10.1504/IJKWI.2009.027925
Enhanced p-sensitive k-anonymity models for privacy preserving data publishing
Sun, Xiaoxun, Wang, Hua, Li, Jiuyong and Truta, Traian Marius. 2008. "Enhanced p-sensitive k-anonymity models for privacy preserving data publishing." Transactions on Data Privacy. 1 (2), pp. 53-66.
Integrating recommendation models for improved web page prediction accuracy
Khalil, Faten, Li, Jiuyong and Wang, Hua. 2008. "Integrating recommendation models for improved web page prediction accuracy." Dobbie, Gillian and Mans, Bernard (ed.) ACSC 2008: 31st Australasian Computer Science Conference. Wollongong, Australia 22 - 25 Jan 2008 Sydney, Australia.
Priority driven K-anonymisation for privacy protection
Sun, Xiaoxun, Wang, Hua and Li, Jiuyong. 2008. "Priority driven K-anonymisation for privacy protection." Roddick, John F., Li, Jiuyong, Christen, Peter and Kennedy, Paul J. (ed.) 7th Australasian Data Mining Conference (AusDM 2008). Glenelg, Adelaide 27 - 28 Nov 2008 Gold Coast, Australia.
Prediction of student actions using weighted Markov models
Huang, Xiaodi, Yong, Jianming, Li, Jiuyong and Gao, Junbin. 2008. "Prediction of student actions using weighted Markov models." Li, Shaozi, Pan, Wei and Yong, Jianming (ed.) IEEE International Symposium on IT in Medicine and Education (ITME 2008). Xiamen, China 12 - 14 Dec 2008 Piscataway, NJ. United States. https://doi.org/10.1109/ITME.2008.4743842
Using association rules to make rule-based classifiers robust
Hu, Hong and Li, Jiuyong. 2005. "Using association rules to make rule-based classifiers robust." Williams, Hugh E. and Dobbie, Gillian (ed.) ADC 2005: 16th Australasian Database Conference. Newcastle, Australia 31 Jan - 03 Feb 2005 Sydney, Australia.
Current developments of k-anonymous data releasing
Li, Jiuyong, Wang, Hua, Jin, Huidong and Yong, Jianming. 2008. "Current developments of k-anonymous data releasing." Electronic Journal of Health Informatics. 3 (1).
Integrating Markov Model with clustering for predicting web page accesses
Khalil, Faten, Wang, Hua and Li, Jiuyong. 2007. "Integrating Markov Model with clustering for predicting web page accesses." 13th Australasian World Wide Web Conference (AusWeb 2007). Coffs Harbour, Australia 30 Jun - 04 Jul 2007 Australia.
Integrating recommendation models for improved web page prediction accuracy
Khalil, Faten, Wang, Hua and Li, Jiuyong. 2007. "Integrating recommendation models for improved web page prediction accuracy." 13th Australasian World Wide Web Conference (AusWeb 2007). Coffs Harbour, Australia 30 Jun - 04 Jul 2007 Australia.
Mining risk patterns in medical data
Li, Jiuyong, Fu, Ada Wai-Chee, He, Hongxing, Chen, Jie, Jin, Huidong, McAullay, Damien, Williams, Graham, Sparks, Ross and Kelman, Chris. 2005. "Mining risk patterns in medical data." Grossman, R., Bayardo, R., Bennett, K. and Vaidya, J. (ed.) 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2005). Chicago, United States 21 - 24 Aug 2005 United States. https://doi.org/10.1145/1081870.1081971
Association rule discovery with unbalanced class distributions
Gu, Lifang, Li, Jiuyong, He, Hongxing, Williams, Graham, Hawkins, Simon and Kelman, Chris. 2003. "Association rule discovery with unbalanced class distributions." Gedeon, Tamas D. and Fung, Lance Chun Che (ed.) 16th Australian Conference on Artificial Intelligence (AI 2003). Perth, Australia 03 - 05 Dec 2003 Berlin, Germany. https://doi.org/10.1007/978-3-540-24581-0_19
Mining the optimal class association rule set
Li, Jiuyong, Shen, Hong and Topor, Rodney. 2002. "Mining the optimal class association rule set." Knowledge-Based Systems. 15 (7), pp. 399-405. https://doi.org/10.1016/S0950-7051(02)00024-2
Mining informative rule set for prediction
Li, Jiuyong, Shen, Hong and Topor, Rodney. 2004. "Mining informative rule set for prediction." Journal of Intelligent Information Systems. 22 (2), pp. 155-174. https://doi.org/10.1023/B:JIIS.0000012468.25883.a5
Analysis of breast feeding data using data mining methods
He, Hongxing, Jin, Huidong, Chen, Jie, McAullay, Damien, Li, Jiuyong and Fallon, Tony. 2006. "Analysis of breast feeding data using data mining methods." Christen, Peter, Kennedy, Paul J., Li, Jiuyong, Simoff, Simeon J. and Williams, Graham J. (ed.) 5th Australasian Conference on Data Mining and Analytics (AusDM 2006). Sydney, Australia 29 - 30 Nov 2006 Sydney, Australia.
A maximally diversified multiple decision tree algorithm for microarray data classification
Hu, Hong, Li, Jiuyong, Wang, Hua, Daggard, Grant and Shi, Mingren. 2006. "A maximally diversified multiple decision tree algorithm for microarray data classification." Boden, Mikael and Bailey, Timothy (ed.) Workshop on Intelligent Systems for Bioinformatics (2006). Hobart, Australia 04 Dec 2006 Sydney, Australia.
A framework of combining Markov model with association rules for predicting web page accesses
Khalil, Faten, Li, Jiuyong and Wang, Hua. 2006. "A framework of combining Markov model with association rules for predicting web page accesses." Christen, Peter, Kennedy, Paul J., Li, Jiuyong, Simoff, Simeon J. and Williams, Graham J. (ed.) 5th Australasian Conference on Data Mining and Analytics (AusDM 2006). Sydney, Australia 29 - 30 Nov 2006 Canberra, Australia.
A comparative study of classification methods for microarray data analysis
Hu, Hong, Li, Jiuyong, Plank, Ashley, Wang, Hua and Daggard, Grant. 2006. "A comparative study of classification methods for microarray data analysis." Christen, Peter, Kennedy, Paul J., Li, Jiuyong, Simoff, Simeon J. and Williams, Graham J. (ed.) 5th Australasian Conference on Data Mining and Analytics (AusDM 2006). Sydney, Australia 29 - 30 Nov 2006 Canberra, Australia.
(alpha, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing
Wong, Raymond Chi-Wing, Li, Jiuyong, Fu, Ada Wai-Chee and Wang, Ke. 2006. "(alpha, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing." Eliassi-Rad, Tina, Ungar, Lyle H., Craven, Mark and Gunopulos, Dimitrios (ed.) 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06). Philadelphia, USA 20 - 23 Aug 2006 New York, USA.
Using multiple and negative target rules to make classifiers more understandable
Li, Jiuyong and Jones, Jason. 2006. "Using multiple and negative target rules to make classifiers more understandable." Knowledge-Based Systems. 19 (6), pp. 438-444. https://doi.org/10.1016/j.knosys.2006.03.003
Current developments of k-anonymous data releasing
Li, Jiuyong, Wang, Hua, Jin, Huidong and Yong, Jianming. 2006. "Current developments of k-anonymous data releasing." Croll, Peter, Morarji, Hasmukh and Au, Richard (ed.) National e-Health Privacy and Security Symposium 2006. Brisbane, Australia 24 - 26 Oct 2006 Brisbane.
Combining gene expression data and gene ontology with the use of a data mining tool
Petrus, Khaleel, Li, Jiuyong and Lopez, J. 2005. "Combining gene expression data and gene ontology with the use of a data mining tool." Summer Symposium in Bioinformatics: Open Problems in Bioinformatics (BioInfoSummer 2005). Canberra, Australia 28 Nov - 02 Dec 2005 Australian National University (ANU).