Detection of Fake Websites by Classification Algorithms (Sahte Web Sitelerinin Sınıflandırma Algoritmaları İle Tespit Edilmesi)

Korkmaz, Adem; Büyükgöze, Selma

dc.contributor.author	Korkmaz, Adem
dc.contributor.author	Büyükgöze, Selma
dc.date.accessioned	2021-12-12T16:51:09Z
dc.date.available	2021-12-12T16:51:09Z
dc.date.issued	2019
dc.identifier.issn	2148-2683
dc.identifier.uri	https://doi.org/10.31590/ejosat.598036
dc.identifier.uri	https://app.trdizin.gov.tr/makale/TXpVNU1USTVPUT09
dc.identifier.uri	https://hdl.handle.net/20.500.11857/2493
dc.description.abstract	The number of phishing websites has increased considerably in recent years. These sites generally aim to profit by capturing people's personal information. The identity and password details in our social media accounts and the identity and address details we store on shopping sites are personal information, and if they fall into the wrong hands the consequences can be worse than we might expect. Because we also carry out a significant share of our financial transactions online, such as internet banking, protection from these sites is an important problem. For this reason, antivirus software vendors, browsers, and search engines work to protect their users from such harmful sites in order to provide better service and user satisfaction. Detecting and blocking fake web pages before they reach users is also an important research area in current artificial intelligence work. On an internet browsed by billions of people every day, the easiest way to protect users from these sites is to detect and block fake web pages automatically. Using machine learning classification algorithms, a system can automatically classify a page as fake or genuine from information about that page, which is one of the main advantages that artificial intelligence offers. This study attempts to determine whether a website address is fake or genuine using 10 predetermined features of the address. The data were taken from the UCI Machine Learning Repository, and the analysis followed the Cross-Industry Standard Process for Data Mining (CRISP-DM). In the data set, the attribute that determines the status of a website is labeled Class, with Phishing = -1, Suspicious = 0, and Legitimate = 1. The analyses were carried out in the R programming language using RStudio. The classification algorithms used are Random Forest (RF), Support Vector Machines (SVM), J48, K-Nearest Neighbor (KNN), and Naive Bayes. In the evaluation, Random Forest achieved the highest accuracy.	en_US
dc.language.iso	tur	en_US
dc.relation.ispartof	Avrupa Bilim ve Teknoloji Dergisi	en_US
dc.identifier.doi	10.31590/ejosat.598036
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	[No Keywords]	en_US
dc.title	Sahte Web Sitelerinin Sınıflandırma Algoritmaları İle Tespit Edilmesi	en_US
dc.title.alternative	Detection of Fake Websites by Classification Algorithms	en_US
dc.type	article
dc.department	Vocational Schools, Vocational School of Technical Sciences, Department of Computer Technologies
dc.identifier.volume	0	en_US
dc.identifier.startpage	826	en_US
dc.identifier.issue	16	en_US
dc.identifier.endpage	833	en_US
dc.relation.publicationcategory	Article - National Peer-Reviewed Journal - Institutional Faculty Member	en_US
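
The abstract in this record describes classifying a website as phishing, suspicious, or legitimate from a small set of address features. The sketch below illustrates that idea with one of the listed algorithms, K-Nearest Neighbor, written in plain Python. It is not the authors' R code: the four features, their -1/0/1 coding, the toy rows, and the function names are all assumptions made for illustration, and only the Class coding (-1, 0, 1) comes from the abstract.

```python
from collections import Counter

# Class coding as stated in the abstract.
PHISHING, SUSPICIOUS, LEGITIMATE = -1, 0, 1


def hamming_distance(a, b):
    """Count the positions at which two feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_rows, train_labels, query, k=3):
    """Predict a label by majority vote among the k nearest training rows."""
    # sorted() is stable, so rows at equal distance keep their input order.
    ranked = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: hamming_distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(predictions, truths):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, truths)) / len(truths)


# Hypothetical toy data: four made-up features per site, each coded -1, 0 or 1.
# These rows are NOT taken from the UCI data set used in the study.
TRAIN_ROWS = [
    (-1, -1, -1, 0),
    (-1, -1, 0, -1),
    (-1, 0, -1, -1),
    (1, 1, 1, 0),
    (1, 1, 0, 1),
    (1, 0, 1, 1),
    (0, 0, 0, 0),
    (0, 0, 0, 1),
    (0, 0, -1, 0),
]
TRAIN_LABELS = [PHISHING] * 3 + [LEGITIMATE] * 3 + [SUSPICIOUS] * 3

if __name__ == "__main__":
    for site in [(-1, -1, -1, -1), (1, 1, 1, 1), (0, 0, 0, -1)]:
        print(site, "->", knn_predict(TRAIN_ROWS, TRAIN_LABELS, site))
```

Comparing several classifiers, as the study does, would amount to computing `accuracy` for each model on held-out rows and reporting the best one.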

Files in this item:

Name: 2493.pdf
Size: 1.021Mb
Format: PDF
Description: Full Text

This item appears in the following collection(s):

Article Collection [73]
TR-Dizin Indexed Publications Collection [1037]