Big data analytics for building personalized movie collection recommendations using Apache Spark MLlib
Indah Survyana Wahyudi(1*)
(1) Sekolah Tinggi Energi dan Mineral-Akamigas
(*) Corresponding Author
Abstract
Introduction. The digital age is marked by an explosion of digital information that complicates information retrieval. Search engines are limited by the keywords or queries that users can recall. Recommendation systems have emerged as a way to deliver personalized information.
Data Collection Method. This paper presents a recommendation engine model built on datasets from movielens.org.
Data Analysis. Alternating Least Squares with Weighted-λ-Regularization (ALS-WR) was used as the big data analytics algorithm for rating prediction, and cosine similarity was applied as a second filter to bring the recommended items closer in genre.
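The rating-prediction step can be illustrated with a minimal sketch. It assumes a rank-1 factorization in plain Python so that each ALS-WR update reduces to a scalar division; the study itself used Spark MLlib with higher-rank factors, and the function name `als_wr_rank1`, the toy ratings, and the parameter values below are hypothetical.

```python
from collections import defaultdict

def als_wr_rank1(ratings, lam=0.01, iterations=20):
    """Rank-1 ALS with weighted-lambda regularization (illustrative only).

    ratings maps (user, item) -> rating. Returns (user_factor, item_factor),
    two dicts of scalars; a predicted rating is user_factor[u] * item_factor[i].
    """
    by_user = defaultdict(list)
    by_item = defaultdict(list)
    for (user, item), r in ratings.items():
        by_user[user].append((item, r))
        by_item[item].append((user, r))

    user_f = {}
    item_f = {item: 1.0 for item in by_item}  # deterministic start (assumption)

    for _ in range(iterations):
        # Fix item factors, solve each user's regularized least-squares problem.
        for user, rated in by_user.items():
            num = sum(r * item_f[i] for i, r in rated)
            # The penalty is weighted by the number of ratings the user gave.
            den = sum(item_f[i] ** 2 for i, _r in rated) + lam * len(rated)
            user_f[user] = num / den
        # Fix user factors, solve each item's problem the same way.
        for item, raters in by_item.items():
            num = sum(r * user_f[u] for u, r in raters)
            den = sum(user_f[u] ** 2 for u, _r in raters) + lam * len(raters)
            item_f[item] = num / den
    return user_f, item_f

# Toy data (hypothetical): user u2 has not rated movie m3.
ratings = {
    ("u1", "m1"): 2.0, ("u1", "m2"): 1.5, ("u1", "m3"): 1.0,
    ("u2", "m1"): 4.0, ("u2", "m2"): 3.0,
}
user_factors, item_factors = als_wr_rank1(ratings)
predicted_u2_m3 = user_factors["u2"] * item_factors["m3"]
```

Because u2's known ratings are twice u1's in this toy data, the predicted rating for the unseen pair lands close to 2. Weighting the penalty by each user's or item's rating count is what distinguishes ALS-WR from ALS with a single uniform penalty.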
Results and Discussions. The Root Mean Squared Error (RMSE) on the 100K dataset was 0.96 (validation) and 0.94 (test); on the 1M dataset, 0.86 (validation) and 0.96 (test); and on the 10M dataset, 0.81 (validation) and 0.81 (test). Cosine similarity was 1 for a 100% resemblance and decreased as similarity declined. In the user acceptance test, 28% of users accepted the first recommendation, and acceptance rose to 62% for the second recommendation.
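For readers unfamiliar with the metric, RMSE is the square root of the mean squared difference between predicted and observed ratings. The following is a minimal sketch in plain Python; the function name and the sample values are hypothetical and are not taken from the study's data.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical predictions against held-out ratings on a 1-5 scale.
# Squared errors are 0.25, 0, 1 and 1, so the mean is 0.5625 and RMSE is 0.75.
example = rmse([3.5, 4.0, 2.0, 5.0], [4.0, 4.0, 3.0, 4.0])
```

A lower value means the predicted ratings sit closer to the observed ones, which is how the validation and test figures above should be read.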
Conclusions. The final results show that 75% of respondents preferred the second recommendation, produced by two-stage filtering, over the recommendation from collaborative filtering alone.
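The two-stage idea can be sketched as follows: candidates that already carry a predicted rating from collaborative filtering are re-ranked by the cosine similarity of their genre vectors to a reference genre profile. This is an illustration under stated assumptions, not the study's implementation; the binary genre encoding, the tie-breaking by predicted rating, and all names and values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    sq_a = sum(x * x for x in a)
    sq_b = sum(y * y for y in b)
    if sq_a == 0 or sq_b == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / math.sqrt(sq_a * sq_b)

def rerank_by_genre(candidates, reference_genres, top_n=10):
    """Second-stage filter over (title, predicted_rating, genre_vector) tuples.

    Sorts by genre similarity to the reference, breaking ties by predicted
    rating, and returns (title, predicted_rating, similarity) tuples.
    """
    scored = [
        (title, rating, cosine_similarity(genres, reference_genres))
        for title, rating, genres in candidates
    ]
    scored.sort(key=lambda t: (t[2], t[1]), reverse=True)
    return scored[:top_n]

# Genre order (assumed): Action, Comedy, Drama.
first_stage = [
    ("Film A", 4.6, [0, 1, 0]),
    ("Film B", 4.2, [1, 0, 1]),
    ("Film C", 4.4, [1, 0, 0]),
]
second_stage = rerank_by_genre(first_stage, reference_genres=[1, 0, 1])
# Resulting order: Film B (identical genres), Film C, Film A.
```

In this toy case the film with the highest predicted rating drops to last place because it shares no genre with the reference profile, which is the kind of reordering the second filter is meant to produce.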
DOI: https://doi.org/10.22146/bip.32208
Copyright (c) 2018 Berkala Ilmu Perpustakaan dan Informasi
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.