Selenium crawler basic usage - a refreshingly clear collection of Python code
A brief overview of Selenium in general. This document is based on Selenium version 3. Version 4 was released recently and its usage differs slightly, so please check that part. Usage and examples are separately...
pythondocs.net
https://sjwiq200.tistory.com/11
[PYTHON] Modifying the User-Agent header value in Selenium
Hello!! Today, while crawling with Python and Selenium, I found out that one site only works in IE... So the solution is to change the User-Agent value in the header to the IE value...
sjwiq200.tistory.com
from selenium import webdriver
import Config  # local module that holds CONFIG['CHROMEPATH']

options = webdriver.ChromeOptions()
# options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
# Send an Internet Explorer 11 User-Agent string instead of Chrome's default.
options.add_argument("user-agent=Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko")
# Selenium 3 style: the driver path is passed through executable_path.
driver = webdriver.Chrome(executable_path=Config.CONFIG['CHROMEPATH'], options=options)
driver.get('')
https://github.com/liamcryan/iherb
GitHub - liamcryan/iherb: Get iherb products and details
Get iherb products and details. Contribute to liamcryan/iherb development by creating an account on GitHub.
github.com
Trying to crawl a website, but an html error is keeping me from making progress..
hashcode.co.kr
import urllib.request
from bs4 import BeautifulSoup

url = 'https://kr.iherb.com/search?kw=21st%20century'
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')
# Collect the product links by their class attribute.
address = soup.find_all(class_='absolute-link product-link')
for i in address:
    print(i.attrs['href'])
print()
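For trying out the link extraction without a network call or a third-party package, here is a minimal sketch that uses the standard library's html.parser in place of BeautifulSoup. The `ProductLinkParser` class and the `SAMPLE_HTML` fragment are made up for illustration and are not taken from the iherb page. The sketch matches tags that carry both classes in any order, which is a little looser than the exact-string match BeautifulSoup applies to a multi-class `class_` value.

```python
from html.parser import HTMLParser


class ProductLinkParser(HTMLParser):
    """Collect href values from tags that carry all of the wanted classes."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes.split())
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Attribute values can be None (e.g. a bare attribute), so fall back to "".
        classes = set((attr_map.get("class") or "").split())
        href = attr_map.get("href")
        if href and self.wanted <= classes:
            self.links.append(href)


# Hypothetical markup, invented for illustration only.
SAMPLE_HTML = """
<div>
  <a class="absolute-link product-link" href="https://example.com/pr/1">One</a>
  <a class="other-link" href="https://example.com/skip">Skip</a>
  <a class="product-link absolute-link" href="https://example.com/pr/2">Two</a>
</div>
"""

parser = ProductLinkParser("absolute-link product-link")
parser.feed(SAMPLE_HTML)
for link in parser.links:
    print(link)
```

On a real page, the `html` bytes returned by `urlopen(...).read()` would need to be decoded to a string before being passed to `feed`.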
url = 'https://somewhere.com'
request = urllib.request.Request(url)
request.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:99.0) Gecko/20100101 Firefox/99.0')
html = urllib.request.urlopen(request).read()
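To check that the header is really attached before anything is sent, the request can be built and inspected offline. This is only a rough sketch: `build_request` is a hypothetical helper name, and the User-Agent string is the one from the snippet above.

```python
import urllib.request

# Browser-like User-Agent string copied from the snippet above.
FIREFOX_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:99.0) "
    "Gecko/20100101 Firefox/99.0"
)


def build_request(url, user_agent=FIREFOX_UA):
    """Return a Request carrying a custom User-Agent (hypothetical helper)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request("https://somewhere.com")
# urllib stores header names capitalized, so the key becomes "User-agent".
print(request.get_header("User-agent"))
# Nothing is sent here; the request would go out via urllib.request.urlopen(request).
```

Passing `headers=` to the constructor has the same effect as calling `add_header` afterwards, so either style works.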
https://gogl3.github.io/articles/2021-03/webcrawling
Web-crawling using Python
Today, we are going to know how to crawl iherb using Python, especially information about supplements. Scheme Before writing codes, it is important to decide which information is needed and how to store. In this example, assume that I want to collect suppl
gogl3.github.io