POST with Python requests - how do I get the table data I'm requesting?


anarchy

I am trying to get historical economic calendar data from this website, https://www.investing.com/economic-calendar/, for the dates 1 Feb 2020 to 5 Feb 2020.

Today is February 4, 2020.

If I post to https://www.investing.com/economic-calendar/ as in the script below, I can pull the table using BeautifulSoup, but I can't select any day other than the current day. The script only ever gives me the table for today (February 4, 2020).

import requests
import pandas as pd
from bs4 import BeautifulSoup

payload = {"country[]":["25","32","6","37","72","22","17","39","14","10","35","43","56","36","110","11","26","12","4","5"],
                "dateFrom":"2020-02-01",
                "dateTo":"2020-02-05",
                "timeZone":"8",
                "timeFilter":"timeRemain",
                "currentTab":"custom",
                "limit_from":"0"}

urlheader = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest"
}

url = "https://www.investing.com/economic-calendar/"

req = requests.post(url, data=payload, headers=urlheader)
print(req)
soup = BeautifulSoup(req.content, "lxml")
table = soup.find('table', id="economicCalendarData")

The table variable looks like this:

table variable

I can see that the page sends a POST request to "https://www.investing.com/economic-calendar/Service/getCalendarFilteredData" whenever I change the date range or filter settings.

Here is the request data I found.

request data

Here is the POST link

post link

So I use the following code instead, as I want to select the dates.

import requests
import pandas as pd
from bs4 import BeautifulSoup

payload = {"country[]":["25","32","6","37","72","22","17","39","14","10","35","43","56","36","110","11","26","12","4","5"],
                "dateFrom":"2020-02-01",
                "dateTo":"2020-02-05",
                "timeZone":"8",
                "timeFilter":"timeRemain",
                "currentTab":"custom",
                "limit_from":"0"}

urlheader = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest"
}

url = "https://www.investing.com/economic-calendar/Service/getCalendarFilteredData"

req = requests.post(url, data=payload, headers=urlheader)
print(req)
soup = BeautifulSoup(req.content, "lxml")
table = soup.find('table', id="economicCalendarData")

But this time the response contains no element with the id economicCalendarData, so soup.find returns None and the table variable is empty. The soup variable does have data in it, but not the table.

This is the table I'm trying to save.

table to save

As I said earlier, if I post to https://www.investing.com/economic-calendar/, I only get the table data for the current day (4 Feb 2020), no matter what dates I put in the payload (dateFrom, dateTo).

When I post to https://www.investing.com/economic-calendar/Service/getCalendarFilteredData instead, the table is empty. The soup variable contains data, but not what I'm requesting. What am I doing wrong? How can I save the table for dates of my choosing?

SIM card

You are really close. If I understand your requirements, the following should help you:

import requests
from bs4 import BeautifulSoup

url = "https://www.investing.com/economic-calendar/Service/getCalendarFilteredData"

payload = {"country[]":["25","32","6","37","72","22","17","39","14","10","35","43","56","36","110","11","26","12","4","5"],
                "dateFrom":"2020-02-01",
                "dateTo":"2020-02-05",
                "timeZone":"8",
                "timeFilter":"timeRemain",
                "currentTab":"custom",
                "limit_from":"0"}

req = requests.post(url, data=payload, headers={
    "User-Agent":"Mozilla/5.0",
    "X-Requested-With": "XMLHttpRequest"
    })
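# The endpoint answers with JSON; the HTML for the table rows is in its "data" field
soup = BeautifulSoup(req.json()['data'],"lxml")
# Walk the rows directly instead of looking for a <table> with a particular id
for items in soup.select("tr"):
    data = [item.get_text(strip=True) for item in items.select("th,td")]
    print(data)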
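
To see why looking up the table by id came back empty, here is a rough offline sketch of what the code above relies on. It assumes the response body is JSON shaped like {"data": "<tr>...</tr>"}, i.e. that the "data" string holds only table rows with no enclosing table element. That shape is inferred from req.json()['data'] and from the empty lookup in the question, not confirmed against the site. The sample body, RowCollector and rows_from_response are made up for illustration, and the parsing uses only the standard library so it runs without network access.

```python
import json
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self.saw_table = False  # True if a <table> tag appears in the input
        self._row = None        # cells of the <tr> currently being read
        self._cell = None       # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.saw_table = True
        elif tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def rows_from_response(body_text):
    """Return (rows, saw_table) for a JSON body shaped like {"data": "<tr>...</tr>"}."""
    fragment = json.loads(body_text)["data"]  # assumed key, as used in the answer
    parser = RowCollector()
    parser.feed(fragment)
    parser.close()
    return parser.rows, parser.saw_table


# Made-up sample body for illustration; not real output from the site.
sample = json.dumps({
    "data": "<tr><td>01:30</td><td>AUD</td><td>Example Event A</td></tr>"
            "<tr><td>09:00</td><td>EUR</td><td>Example Event B</td></tr>"
})

rows, saw_table = rows_from_response(sample)
print(rows)       # list of rows, one list of cell strings per <tr>
print(saw_table)  # False here: the fragment has rows but no <table> element
```

If the real response matches this assumption, a lookup for a table with id economicCalendarData has nothing to find, which is why iterating over the tr elements works where the id lookup did not. The resulting list of rows can be passed to pandas.DataFrame if a table object is needed.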
