Hacking | Scripts | Download Pdf of website

Bakul Gupta
2 min read | Mar 15, 2019


Today I was looking for a PDF of a book on the internet, so I used Google dorking to make my search easier and more fruitful.

index of: /hacking pdf

Using this dork, I was able to find the book I wanted.

But when I opened the website, I found other interesting books as well. I was left with two options:

  1. Download each file manually by clicking on each link.
  2. Automate the process.

So I chose the second option and automated the process. There are various ways to automate it, but here I am using Python.

Python Modules Used

a. BeautifulSoup

b. urllib

c. request (the urllib submodule imported in the script)
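To make the link-extraction step concrete, here is a minimal sketch that pulls the PDF links out of a directory listing. It deliberately swaps BeautifulSoup for the standard library's html.parser so it runs without any install. The names PdfLinkParser, find_pdf_links and SAMPLE_LISTING, and the sample HTML itself, are made up for illustration.

```python
from html.parser import HTMLParser


class PdfLinkParser(HTMLParser):
    """Collects the href of every <a> tag that points at a .pdf file."""

    def __init__(self):
        super().__init__()
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        # Skip anchors without an href; match the extension case-insensitively
        if href and href.lower().endswith('.pdf'):
            self.pdf_links.append(href)


def find_pdf_links(html):
    """Return the PDF hrefs found in an HTML string, in document order."""
    parser = PdfLinkParser()
    parser.feed(html)
    return parser.pdf_links


# Hypothetical "index of" listing, invented for this example
SAMPLE_LISTING = """
<html><body><h1>Index of /hacking</h1>
<a href="../">Parent Directory</a>
<a href="book-one.pdf">book-one.pdf</a>
<a href="notes.txt">notes.txt</a>
<a href="Book%20Two.PDF">Book Two.PDF</a>
<a name="top">no href here</a>
</body></html>
"""

if __name__ == "__main__":
    print(find_pdf_links(SAMPLE_LISTING))
```

Anchors without an href are skipped rather than raising an error, and the extension check ignores case, so a file named .PDF is picked up too.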

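#!/usr/bin/python3

# GitHub :- Bullhacks3

# Modules Import
import os
from urllib import request
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Directory where the downloaded files are stored
STORAGE_DIR = '/media/bakul/@Bakul@/CyberSecurity/Books'


def files_downloader(url):
    # Fetch the directory listing and parse its HTML
    site_url = request.urlopen(url)
    soup = BeautifulSoup(site_url, 'html.parser')
    # Visit only the anchors that actually carry an href attribute
    for link in soup.find_all('a', href=True):
        href = link['href']
        if href.split('.')[-1] == 'pdf':
            print('File Downloading is in process :-' + str(href))
            # Resolve the link against the page URL
            full_url = urljoin(url, href)
            # Save the file under the storage directory, keeping its name
            storage = os.path.join(STORAGE_DIR, os.path.basename(href))
            request.urlretrieve(full_url, storage)
            print('File successfully downloaded')


if __name__ == "__main__":
    take_input = str(input("Enter the url of the website :-"))
    files_downloader(take_input)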
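The step that turns each link into a download URL and a local file path can be sketched on its own with the standard library. This is only an illustration under stated assumptions: the helper name resolve_download, the example.com URLs and the /tmp/books directory are all hypothetical and not part of the original script.

```python
import os
import posixpath
from urllib.parse import unquote, urljoin, urlparse


def resolve_download(page_url, href, storage_dir):
    """Return (full_url, local_path) for one link found on page_url."""
    # urljoin handles relative links, absolute paths and full URLs alike
    full_url = urljoin(page_url, href)
    # Decode escapes such as %20 first, then keep only the last path segment
    # so the file always lands directly inside storage_dir
    file_name = posixpath.basename(unquote(urlparse(full_url).path))
    return full_url, os.path.join(storage_dir, file_name)


if __name__ == "__main__":
    # Hypothetical page URL and storage directory
    print(resolve_download('http://example.com/hacking/',
                           'Book%20Two.pdf', '/tmp/books'))
```

Keeping this logic in a small pure function makes it easy to check without touching the network, and the page URL should end with a slash so relative links resolve inside that directory.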

Output :-


Bakul Gupta

Security Engineer | Trying to learn something new each and everyday!!!