Error getting data from website
Michael Torrie wrote:
> On 12/6/19 5:31 PM, DL Neil via Python-list wrote:
>> If you read the HTML data that the REPL has happily splattered all over
>> your terminal's screen (scroll back) (NB "soup" is easier to read than
>> is "content"!) you will observe that what you saw in your web-browser is
>> not what Amazon served in response to the Python "requests.get()"!
That's not the problem here. Quoting the html returned by
To discuss automated access to Amazon data please contact
api-services-support at amazon.com.
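
A script can check for that notice before it tries to parse anything. The following is only a minimal sketch: the name looks_blocked() is made up for illustration, and treating the quoted sentence as a reliable sign of a refused request is an assumption, not documented Amazon behaviour.

```python
# Marker text taken from the notice quoted above. Assumption: its presence
# means the request was treated as automated access and no product page
# was served.
BLOCK_MARKER = "To discuss automated access to Amazon data"


def looks_blocked(html: str) -> bool:
    """Return True if the page text contains the automated-access notice."""
    # Collapse runs of whitespace so the check survives line wrapping.
    normalized = " ".join(html.split())
    return BLOCK_MARKER in normalized


if __name__ == "__main__":
    sample = "<!-- To discuss automated access\n to Amazon data please contact us -->"
    print(looks_blocked(sample))
    print(looks_blocked("<span id='priceblock_dealprice'></span>"))
```

If the check returns True there is no point in searching the soup for a price element, because the product page was never delivered.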
If you retrieve the page manually:
$ wget "https://www.amazon.ca/dp/B07RZFQ6HC" -O tmp.gz
2019-12-07 11:47:03 (80.6 KB/s) - 'tmp.gz' saved
$ gunzip tmp.gz
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(open("tmp").read())
>>> soup.find("span", dict(id="priceblock_dealprice"))
<span class="a-size-medium a-color-price priceBlockDealPriceString"
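
The gunzip step can also be done from Python. This is a rough sketch under stated assumptions: load_html() is a hypothetical helper, UTF-8 is a guess at the page encoding, and the check for the gzip magic bytes is there so the same function also accepts a file that has already been unpacked.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def load_html(path: str, encoding: str = "utf-8") -> str:
    """Read a saved page, decompressing it first if it is gzip data."""
    raw = Path(path).read_bytes()
    if raw.startswith(GZIP_MAGIC):
        raw = gzip.decompress(raw)
    # errors="replace" keeps going on bytes that do not fit the assumed
    # encoding instead of raising UnicodeDecodeError.
    return raw.decode(encoding, errors="replace")
```

The returned string can be handed to BeautifulSoup in place of open("tmp").read(), for example BeautifulSoup(load_html("tmp.gz"), "html.parser").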
> So scraping static html is probably not going to get you where you want
> to go.
... because Amazon doesn't like what you are doing. You can cheat, or play by
their rules and use the API.
> There are heavier tools, such as Selenium that uses a real
> browser to grab a page, and the result of that you can parse and search