Phishing is a type of social engineering attack that can affect any organisation or individual. This paper explores the effect that different features and optimisation techniques have on the accuracy of intelligent phishing detection using machine learning algorithms, covering both hyperparameter optimisation and feature selection. For hyperparameter tuning, both TPE (Tree-structured Parzen Estimator) and GA (Genetic Algorithm) were tested, with the best option being model dependent. For feature selection, GA, MFO (Moth Flame Optimisation) and PSO (Particle Swarm Optimisation) were used, with PSO working best with a Random Forest model. The features used relate to the URL (Uniform Resource Locator), DOM (Document Object Model) structure, page rank and page information. The best combination found was Random Forest using PSO for feature selection and TPE for hyperparameter optimisation, giving an accuracy of 99.33%.
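To make the feature selection step more concrete, the following is a minimal sketch of binary PSO searching over feature masks. It is not the paper's implementation: the fitness function `toy_fitness`, the set `INFORMATIVE`, the feature count and all PSO parameters (`w`, `c1`, `c2`, swarm size, iterations) are assumptions chosen for illustration. In a real pipeline the fitness would presumably be something like the validation accuracy of a Random Forest trained on the selected features.

```python
import math
import random

# Hypothetical stand-in for "validation accuracy of a model trained on the
# selected features". Features 0, 2 and 5 are treated as informative, and
# every selected feature carries a small cost.
INFORMATIVE = {0, 2, 5}
N_FEATURES = 8


def toy_fitness(mask):
    """Score a boolean feature mask (higher is better)."""
    hits = sum(1 for i, keep in enumerate(mask) if keep and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: velocities are real-valued, positions are 0/1 masks."""
    rng = random.Random(seed)
    pos = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Pull the velocity towards the personal and global bests.
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                # The sigmoid maps velocity to the probability of keeping feature j.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = rng.random() < prob
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


mask, score = binary_pso(toy_fitness, N_FEATURES)
print("selected features:", [i for i, keep in enumerate(mask) if keep])
print("fitness:", score)
```

The same loop structure would apply with a model-based fitness. Only `toy_fitness` would change, at the cost of one model fit per particle per iteration.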
|Number of pages||8|
|Publication status||Accepted/In press - 14 Oct 2020|
|Event||19th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, Guangzhou University, Guangzhou, China (29 Dec 2020 → 1 Jan 2021)|
|Conference||19th IEEE International Conference on Trust, Security and Privacy in Computing and Communications|
|Abbreviated title||TrustCom 2020|
|Period||29/12/20 → 1/01/21|