Web Scraping: Legal and Ethical Considerations in General and Local Context-A Review


Kazmali A. S., SAYAR A.

6th International Conference on Futuristic Trends in Networks and Computing Technologies, FTNCT 2024, Uttarakhand, India, 23 - 24 December 2024, vol.259, pp.1563-1572, (Full Text)

  • Publication Type: Conference Paper / Full Text
  • Volume: 259
  • Doi Number: 10.1016/j.procs.2025.04.111
  • City: Uttarakhand
  • Country: India
  • Page Numbers: pp.1563-1572
  • Keywords: Ethics, Protection, Web Scraping, Web Scraping Techniques
  • Kocaeli University Affiliated: Yes

Abstract

In today's fast-paced, competitive environment, demand for data is growing across fields such as finance, manufacturing, artificial intelligence, and academic research. Data can be drawn from many kinds of sources, printed or digital, and online sources make access especially quick and easy. Web scraping has therefore become an important tool for collecting data from online sources. Tools written for web scraping can automatically collect data from online sources in HyperText Markup Language (HTML), Extensible Markup Language (XML), and JavaScript Object Notation (JSON) formats. To serve as a guide for developers of web scraping tools, this study explains web scraping techniques such as human copy-and-paste, HTML parsing, API scraping, and XPath. On the other side, load balancing, rate limiting, IP blocking, and CAPTCHA systems can be used to protect running systems from the negative effects of web scraping. Scraping data from online sources can lead to legal or ethical violations, because it may involve the extraction of personal data, data theft, and violation of system policies, and it may disrupt running services or cause access problems for users. This study discusses web scraping methods, methods for preventing scraping, and the ethical and legal consequences as a comparative study of the global and local contexts. It aims to provide a guide for web scraper developers so that researchers, companies, and organizations that collect data through web scraping pay attention to ethical and legal considerations.
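
As an illustration of how the HTML parsing technique and the rate-limiting concern named in the abstract might meet in practice, the following is a minimal sketch using only the Python standard library. It is not taken from the paper. The names `LinkExtractor`, `extract_links`, `is_allowed`, and `RateLimiter`, the `demo-bot` user agent, and the `example.com` URLs are all hypothetical. The sketch works on text already in memory and performs no network requests.

```python
import time
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags (a simple form of HTML parsing)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return every link target found in the given HTML text, in order."""
    parser = LinkExtractor()
    parser.feed(html_text)
    return parser.links


def is_allowed(robots_txt, user_agent, url):
    """Check a URL against robots.txt rules supplied as plain text."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


class RateLimiter:
    """Client-side politeness: keep a minimum delay between requests."""

    def __init__(self, min_interval_seconds):
        self.min_interval = min_interval_seconds
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


if __name__ == "__main__":
    # Hypothetical sample inputs, assumed for illustration only.
    sample_html = '<a href="/public/a">A</a><a href="/private/b">B</a>'
    sample_robots = "User-agent: *\nDisallow: /private/\n"
    limiter = RateLimiter(0.5)
    for link in extract_links(sample_html):
        url = "https://example.com" + link
        if is_allowed(sample_robots, "demo-bot", url):
            limiter.wait()  # a real scraper would fetch the page after this
            print("would fetch", url)
        else:
            print("skipping (disallowed by robots.txt)", url)
```

Checking robots.txt and pacing requests on the client side is one possible way for a scraper to respect a site's stated policies and avoid overloading its services. It does not by itself settle the legal questions the abstract raises, such as the handling of personal data.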