Adam Mohiuddin
My Project
Abstract
Web scraping, a powerful data extraction technique, has gained traction across various sectors. This study evaluates the capabilities of web scraping in mobile applications, with an emphasis on real-time data access and personalized content delivery. By leveraging web scraping, mobile apps can centralize and streamline access to large datasets, such as scholarship listings, to provide innovative user experiences. The research identifies the ethical challenges of web scraping, evaluates its benefits and drawbacks, and highlights its potential for responsible, data-driven solutions.
Introduction
Web scraping is defined as "the automated extraction of information from websites." The practice dates to the 1990s, and tools such as BeautifulSoup and Scrapy have since made it more sophisticated. This study explores the integration of web scraping into mobile apps, particularly as a way to reduce the effort users spend accessing large amounts of information. Challenges remain, including legal and ethical concerns and the need to adapt to dynamic HTML structures. As AI and machine learning evolve, the role of web scraping in various applications will likely expand.
Methodology
The project aimed to integrate web scraping into a mobile app to simplify scholarship searches. The app, ScholarSync (developed for iOS), aggregates publicly available scholarship information from platforms like Fastweb and Scholarships.com. It offers centralized search, filter functionality, bookmarking, and detailed views to streamline the user experience.
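The centralized search, filtering, and bookmarking described above can be illustrated with a small sketch. It is written in Python for brevity and is not the app's Swift code; the `Scholarship` fields, the `search` function, and the sample listings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Scholarship:
    """One aggregated listing; the field names are illustrative assumptions."""
    name: str
    description: str
    deadline: date
    url: str


def search(listings, keyword="", due_on_or_after=None):
    """Return listings matching a keyword and, optionally, a minimum deadline."""
    needle = keyword.lower()
    results = []
    for item in listings:
        text = f"{item.name} {item.description}".lower()
        if needle not in text:
            continue
        if due_on_or_after is not None and item.deadline < due_on_or_after:
            continue
        results.append(item)
    # Soonest deadline first, so urgent listings surface at the top.
    return sorted(results, key=lambda s: s.deadline)


# Hypothetical sample data, not real scholarships.
LISTINGS = [
    Scholarship("STEM Leaders Award", "For engineering majors",
                date(2025, 3, 1), "https://example.com/stem"),
    Scholarship("Community Service Grant", "For volunteers",
                date(2025, 1, 15), "https://example.com/service"),
    Scholarship("Women in Engineering", "Engineering undergraduates",
                date(2025, 2, 1), "https://example.com/wie"),
]

# Bookmarks are tracked by URL, which serves as a stable identifier here.
bookmarks = set()
bookmarks.add(LISTINGS[0].url)

matches = search(LISTINGS, keyword="engineering")
print([s.name for s in matches])
# → ['Women in Engineering', 'STEM Leaders Award']
```

Keying bookmarks by URL rather than by name is one possible design choice: names can repeat across sources, while a listing's link usually does not.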
Implementation
The app was developed in Swift using Xcode, with SwiftSoup for HTML parsing. Scripts extracted scholarship details such as name, description, deadline, and URL from multiple sources with varying HTML structures. Challenges included inaccessible data and repeated elements, which required parsing logic that adapted to each source. Data was stored securely in Firebase, with strict user authentication protocols to protect privacy and maintain ethical compliance.
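One way to handle sources with different HTML structures is to drive a single parser from a per-source map of class names, then drop repeated listings by URL. The sketch below shows that idea in Python with the standard library's `html.parser`; it is not the app's SwiftSoup code. The markup, the class names, and the `ListingParser` and `scrape` helpers are all hypothetical, and the parser assumes well-formed markup without unclosed void tags such as a bare `<br>`.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collects one dict per listing, driven by a per-source class-name map."""

    def __init__(self, item_class, field_classes):
        super().__init__()
        self.item_class = item_class        # CSS class marking one listing
        self.field_classes = field_classes  # CSS class -> output field name
        self.items = []
        self._current = None  # listing being filled in, if any
        self._field = None    # field whose text is expected next, if any
        self._depth = 0       # nesting depth inside the current listing

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._current is None:
            if self.item_class in classes:
                self._current, self._depth = {}, 1
            return
        self._depth += 1
        if tag == "a" and "href" in attrs:
            # The first link inside a listing is taken as its URL.
            self._current.setdefault("url", attrs["href"])
        for cls in classes:
            if cls in self.field_classes:
                self._field = self.field_classes[cls]

    def handle_data(self, data):
        if self._current is not None and self._field and data.strip():
            self._current[self._field] = data.strip()
            self._field = None

    def handle_endtag(self, tag):
        if self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:  # the listing's own element just closed
            self.items.append(self._current)
            self._current, self._field = None, None


def scrape(sources):
    """Parse every source and keep the first copy of each URL."""
    seen, merged = set(), []
    for html, item_class, field_classes in sources:
        parser = ListingParser(item_class, field_classes)
        parser.feed(html)
        for item in parser.items:
            # Skip listings with no link, and listings already collected.
            if "url" not in item or item["url"] in seen:
                continue
            seen.add(item["url"])
            merged.append(item)
    return merged


# Hypothetical markup for two sources with different structures.
SOURCE_A = """
<div class="scholarship">
  <h2 class="title"><a href="https://example.com/a/1">STEM Leaders Award</a></h2>
  <p class="summary">For engineering majors</p>
  <span class="deadline">2025-03-01</span>
</div>
"""
SOURCE_B = """
<ul>
  <li class="result">
    <a class="result-name" href="https://example.com/b/7">Community Service Grant</a>
    <div class="blurb">For volunteers</div>
    <em class="due">2025-01-15</em>
  </li>
  <li class="result">
    <a class="result-name" href="https://example.com/a/1">STEM Leaders Award</a>
    <div class="blurb">For engineering majors</div>
    <em class="due">2025-03-01</em>
  </li>
</ul>
"""

listings = scrape([
    (SOURCE_A, "scholarship",
     {"title": "name", "summary": "description", "deadline": "deadline"}),
    (SOURCE_B, "result",
     {"result-name": "name", "blurb": "description", "due": "deadline"}),
])
print([item["name"] for item in listings])
# → ['STEM Leaders Award', 'Community Service Grant']
```

Keeping the source-specific details in a small map means that supporting a new site, or a site whose markup has changed, only requires editing that map, not the parsing logic.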
Evaluation
ScholarSync simplifies scholarship searches, but it has limitations, including slow data retrieval and limited use of official APIs. Future improvements could include advanced filtering, integration with official APIs, and closer compliance with legal frameworks.
Conclusion
Web scraping proves to be a valuable tool for mobile apps: it provides centralized access to diverse data that can be presented through user-friendly interfaces. The findings underscore the importance of ethical use, including adherence to privacy laws and obtaining permission before scraping content. While limitations exist, such as varying HTML structures and slow data retrieval, the potential of web scraping to drive innovation in mobile apps is significant. Developers must strike a balance between user convenience and respect for data ethics.
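As one illustration of checking permission before scraping, a scraper can consult a site's robots.txt file, which is a common but not sufficient signal of what a site allows. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt content, the `ScholarSyncBot` user agent, and the URLs are hypothetical, and nothing in the project description says the app performs this check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real app would download the site's own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules permit fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "ScholarSyncBot", "https://example.com/scholarships/123"))
# → True
print(is_allowed(ROBOTS_TXT, "ScholarSyncBot", "https://example.com/account/settings"))
# → False
```

A check like this covers only the crawler rules a site publishes; terms of service and privacy law still have to be reviewed separately.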
Acknowledgments
I extend my deepest gratitude to the researchers and mentors who supported this project. Special thanks go to Mohit Nadkarni for his guidance and encouragement, and to Sal Marcuz for providing insights into the legal and ethical aspects of web scraping. This project would not have been possible without their invaluable support.
References
Shubhada, S. (2023, January 4). Web Scraping: What it is and how companies can leverage it. Forbes.
Nehal. (2024, May 14). What is a Web Scraper and How Does Web Scraping Work? PromptCloud.
Krijien, D., Bot, R., & Lampropoulos, G. (2014). Automated Web Scraping APIs. Online: https://intelligence.techleads.io.
Khder, M. A. (2021). Web scraping or web crawling: State of art, techniques, approaches, and application. International Journal of Advances in Soft Computing & Its Applications, 13(3).
Contact
I'm always looking for new and exciting opportunities. Let's connect.