Hacking | Scripts | Download Pdf of website
Today I was looking for a PDF of a book on the internet, so I used Google dorking to make the search easier and more fruitful.
index of: /hacking pdf
Using this dork, I was able to find the book I wanted.
But when I opened the website, I found other interesting books as well. I was left with two options:
- Download each file manually by clicking on each link.
- Automate the process.
So I chose the second option and automated the process. There are various ways to do this, but here I use Python.
Python Modules Used
a. BeautifulSoup
b. urllib
c. Requests
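Before the full script, here is a minimal sketch of the two link-handling steps it depends on: deciding whether a link points to a PDF, and turning a relative link into a full URL. It uses only urllib.parse from the standard library. The helper names is_pdf_link and resolve_link, and the example.com address, are made up for illustration and are not part of the original script.

```python
from urllib.parse import urljoin, urlparse


def is_pdf_link(href):
    """Return True when the path part of href ends in .pdf (case-insensitive)."""
    # urlparse separates the path from any ?query or #fragment
    return urlparse(href).path.lower().endswith(".pdf")


def resolve_link(page_url, href):
    """Turn an href found on page_url into an absolute URL."""
    # urljoin handles relative, root-relative and already-absolute links
    return urljoin(page_url, href)


if __name__ == "__main__":
    page = "http://example.com/hacking/"  # hypothetical index page
    for href in ["book.pdf", "/other/notes.PDF", "readme.txt", "book.pdf?dl=1"]:
        if is_pdf_link(href):
            print(resolve_link(page, href))
```

Joining with urljoin rather than plain string concatenation should also cope with links that start with a slash or that are already absolute.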
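#!/usr/bin/python3
# GitHub :- Bullhacks3
# Modules Import
import os
from urllib import request
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Directory where the downloaded books are stored
STORAGE_DIR = '/media/bakul/@Bakul@/CyberSecurity/Books'

def files_downloader(url):
    # Fetch the directory listing and parse its HTML
    site_url = request.urlopen(url)
    soup = BeautifulSoup(site_url, 'html.parser')
    # Visit only the anchor tags that actually carry an href
    for link in soup.find_all('a', href=True):
        href = link['href']
        if href.lower().endswith('.pdf'):
            print('File Downloading is in process :-' + href)
            # Resolve the (possibly relative) link against the page URL
            full_url = urljoin(url, href)
            # Save under the storage directory using the file's own name
            storage = os.path.join(STORAGE_DIR, os.path.basename(href))
            request.urlretrieve(full_url, storage)
            print('File successfully downloaded')

if __name__ == "__main__":
    take_input = input("Enter the url of the website :-")
    files_downloader(take_input)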
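One thing to watch when saving the files: links in a directory listing are often percent-encoded (a space shows up as %20), and some include folders or a query string. The sketch below is one possible way to derive a clean local file name from such a link. The helper local_path_for and the "books" directory are hypothetical names used only for this illustration.

```python
import os
from urllib.parse import unquote, urlparse


def local_path_for(href, directory):
    """Build the path where the file behind href would be saved in directory."""
    # Keep only the path part, then decode escapes such as %20
    path = unquote(urlparse(href).path)
    # basename drops any leading folders so the file lands directly in directory
    filename = os.path.basename(path)
    return os.path.join(directory, filename)


if __name__ == "__main__":
    # "books" is a placeholder directory used only for this illustration
    print(local_path_for("Web%20Hacking%20101.pdf", "books"))
    print(local_path_for("/pdfs/network/nmap.pdf", "books"))
```

Decoding before taking the base name means an encoded slash in the link cannot push the file outside the chosen directory.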