DBILITY
python html parse and image file save
- requests
Requests is a simple, yet elegant, HTTP library.
    >>> import requests
    >>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
    >>> r.status_code
    200
    >>> r.headers['content-type']
    'application/json; charset=utf8'
    >>> r.encoding
    'utf-8'
    >>> r.text
    '{"type":"User"...'
    >>> r.json()
    {'disk_usage': 368627, 'private_gists': 484, ...}

    # image save (image_url and file_name are placeholders you supply)
    imgRequest = requests.get(image_url)
    image = open(file_name, mode='wb')
    image.write(imgRequest.content)
    image.close()

    # or, so the file is closed automatically
    with open(file_name, 'wb') as image:
        image.write(imgRequest.content)
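The image-save snippet above writes whatever the server returns, including an error page. A minimal sketch of a more defensive version follows, assuming you want to fail on HTTP errors and stream the body in chunks. The function name `download_image` and its `session` and `timeout` parameters are hypothetical, introduced only for this illustration.

```python
import requests


def download_image(image_url, file_name, session=None, timeout=10):
    """Download image_url and write the raw bytes to file_name.

    `session` is a hypothetical hook: any object with a requests-style
    get() method (for example a requests.Session). Defaults to the
    requests module itself.
    """
    http = session or requests
    response = http.get(image_url, stream=True, timeout=timeout)
    # Raise on 4xx/5xx instead of silently saving an error page.
    response.raise_for_status()
    with open(file_name, "wb") as image:
        # Write the body chunk by chunk rather than loading it all at once.
        for chunk in response.iter_content(chunk_size=8192):
            image.write(chunk)
    return file_name
```

Usage would look like `download_image(image_url, "6.1.jpg")`; passing a `requests.Session` as `session` reuses one connection when saving many images from the same host.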
- BeautifulSoup4
Beautiful Soup is a library that makes it easy to scrape information from web pages. It sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree.
    >>> from bs4 import BeautifulSoup
    >>> soup = BeautifulSoup("<p>Some<b>bad<i>HTML", "html.parser")
    >>> print(soup.prettify())
    <p>
     Some
     <b>
      bad
      <i>
       HTML
      </i>
     </b>
    </p>
    >>> soup.find(text="bad")
    'bad'
    >>> soup.i
    <i>HTML</i>
    #
    >>> soup = BeautifulSoup("<tag1>Some<tag2/>bad<tag3>XML", "xml")
    #
    >>> print(soup.prettify())
    <?xml version="1.0" encoding="utf-8"?>
    <tag1>
     Some
     <tag2/>
     bad
     <tag3>
      XML
     </tag3>
    </tag1>

    # For select() and find_all(), see the manual.
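As a rough illustration of `find_all()` and `select()` for the image-saving task, the sketch below collects the `src` of every `<img>` tag and resolves it against the page URL. The helper names and the sample HTML are assumptions made up for this example, not part of the original post.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_image_urls(html, base_url):
    """Return absolute URLs for every <img> that has a src attribute (find_all)."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for img in soup.find_all("img", src=True):
        # urljoin turns relative paths into absolute URLs.
        urls.append(urljoin(base_url, img["src"]))
    return urls


def extract_image_urls_css(html, base_url):
    """Same result, using a CSS selector through select()."""
    soup = BeautifulSoup(html, "html.parser")
    return [urljoin(base_url, img["src"]) for img in soup.select("img[src]")]


# Hypothetical sample page, for illustration only.
sample_html = (
    '<div><img src="_images/6.1.jpg">'
    '<img alt="no source">'
    '<img src="https://example.com/a.png"></div>'
)
base = "https://www.crummy.com/software/BeautifulSoup/bs4/doc/"
print(extract_image_urls(sample_html, base))
```

Both helpers skip the `<img>` without a `src`, so the returned list can be handed to a download step without further filtering.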
- urllib
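    import urllib.request

    # save to file
    # urllib.request.urlretrieve(image_url, file_name)
    urllib.request.urlretrieve("https://www.crummy.com/software/BeautifulSoup/bs4/doc/_images/6.1.jpg", "6.1.jpg")

Saved result: the image is written to 6.1.jpg in the current working directory.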
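The file name passed to `urlretrieve` does not have to be typed by hand. A minimal sketch, using only the standard library, that derives it from the URL path follows. The names `file_name_from_url` and `save_image` and the `image.bin` fallback are hypothetical choices for this example.

```python
import os
import urllib.request
from urllib.parse import unquote, urlparse


def file_name_from_url(image_url, default="image.bin"):
    """Take the last path segment of the URL as the file name."""
    path = urlparse(image_url).path  # drops the query string and fragment
    name = os.path.basename(unquote(path))  # decode %20 and similar escapes
    return name or default  # fall back when the URL ends with "/"


def save_image(image_url, directory="."):
    """Download image_url into directory and return the saved path."""
    file_name = os.path.join(directory, file_name_from_url(image_url))
    saved_path, _headers = urllib.request.urlretrieve(image_url, file_name)
    return saved_path
```

For the URL used above, `file_name_from_url` gives `6.1.jpg`, so `save_image(url)` writes to the same file as the explicit call.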