Asked  7 Months ago    Answers:  5   Viewed   40 times

I would like to write a PHP script that can capture a page from a website. Think file_get_contents($url).

However, this website requires you to fill in a username/password log-in form before you can access any page. I imagine that once you are logged in, the website sends your browser an authentication cookie, and with every subsequent browser request the session info is passed back to the website to authenticate access.

I want to know how I can simulate this browser behavior with a PHP script in order to gain access and capture a page from this website.

More specifically, my questions are:

  1. How do I send a request that contains my log-in details so that the website replies with the session information/cookie?
  2. How do I read the session information/cookie?
  3. How do I pass this session information back to the website with every subsequent request (file_get_contents, curl)?

Thanks.

 Answers

35

cURL is well suited to this. You don't need to do anything special other than set the CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE options. Once you've logged in by posting the form fields from the site, the cookie is saved and cURL uses that same cookie for subsequent requests automatically, as the example below illustrates.

Note that the function below saves the cookies to cookies/cookies.txt, so make sure that directory exists and can be written to.

$loginUrl = 'http://example.com/login'; // action URL from the login form
$loginFields = array('username' => 'user', 'password' => 'pass'); // login form field names and values
$remotePageUrl = 'http://example.com/remotepage.html'; // URL of the page you want to save

$login = getUrl($loginUrl, 'post', $loginFields); // log in to the site; the session cookie goes into the jar

$remotePage = getUrl($remotePageUrl); // get the remote page, sending the saved cookie

function getUrl($url, $method = '', $vars = array()) {
    $ch = curl_init();
    if ($method == 'post') {
        curl_setopt($ch, CURLOPT_POST, 1);
        // URL-encode array fields, as a browser does for a standard form
        curl_setopt($ch, CURLOPT_POSTFIELDS, is_array($vars) ? http_build_query($vars) : $vars);
    }
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects, e.g. after a successful login
    curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies/cookies.txt');  // received cookies are written here
    curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies/cookies.txt'); // saved cookies are read from here
    $buffer = curl_exec($ch);
    curl_close($ch);
    return $buffer;
}
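
The answer above covers sending the log-in and reusing the cookie, but not reading it (question 2). As a minimal sketch, assuming the jar is in the tab-separated Netscape cookie-file format that cURL writes for CURLOPT_COOKIEJAR, the saved cookies could be read back roughly like this. parseCookieJar() is a made-up helper name, not part of PHP or cURL.

```php
<?php
// Hypothetical helper: turn the text of a Netscape-format cookie jar
// into a simple name => value array.
function parseCookieJar(string $contents): array
{
    $cookies = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        // cURL marks HttpOnly cookies with a "#HttpOnly_" prefix on the domain.
        if (strncmp($line, '#HttpOnly_', 10) === 0) {
            $line = substr($line, 10);
        }
        // Skip blank lines and ordinary comment lines.
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        // Field order: domain, subdomain flag, path, secure, expiry, name, value.
        $fields = explode("\t", $line);
        if (count($fields) !== 7) {
            continue;
        }
        $cookies[$fields[5]] = $fields[6];
    }
    return $cookies;
}
```

It would be called after the log-in request has finished and the handle has been released, for example parseCookieJar(file_get_contents('cookies/cookies.txt')). If the same name is set for several domains or paths, this sketch keeps only the last one.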
Wednesday, March 31, 2021
 
e_i_pi
answered 7 Months ago
80

I think cURL is more secure, because to work with remote files through file_get_contents() you need to enable allow_url_fopen.

References:
http://25labs.com/alternative-for-file_get_contents-using-curl/
http://phpsec.org/projects/phpsecinfo/tests/allow_url_fopen.html

Continuing the discussion from the comments on the question: yes, cURL gives you more options, and you can find them all in its documentation.
file_get_contents() on its own just makes a simple GET request.
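
As a rough illustration of the allow_url_fopen point, a script could check which of the two mechanisms is usable before choosing one. This is only a sketch; pickHttpTransport() and its return values are invented names for the example.

```php
<?php
// Hypothetical helper: report which HTTP mechanism this PHP install can use.
function pickHttpTransport(): ?string
{
    // Prefer the cURL extension when it is loaded.
    if (extension_loaded('curl')) {
        return 'curl';
    }
    // ini_get() returns the setting as a string such as "1", "", "0" or "On".
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        return 'streams'; // file_get_contents() / fopen() may open URLs
    }
    return null; // neither mechanism is available
}
```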

Saturday, May 29, 2021
 
Juriy
answered 5 Months ago
100

file_get_contents() is a simple screwdriver. Great for simple GET requests where the headers, HTTP request method, timeout, cookie jar, redirects, and other important things do not matter.

fopen() with a stream context or cURL with setopt are power drills with every bit and option you can think of.
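
To make the stream-context option concrete, here is a minimal sketch of a context that sends a POST with form fields and a Cookie header. buildPostContext() and its parameters are hypothetical names chosen for the example; stream_context_create() and the 'http' context options are standard PHP.

```php
<?php
// Hypothetical helper: build a stream context for a form POST that also
// sends the given cookies (name => value) back to the server.
function buildPostContext(array $fields, array $cookies = [], float $timeout = 10.0)
{
    $headers = ['Content-Type: application/x-www-form-urlencoded'];
    if ($cookies) {
        $pairs = [];
        foreach ($cookies as $name => $value) {
            $pairs[] = $name . '=' . $value;
        }
        $headers[] = 'Cookie: ' . implode('; ', $pairs);
    }
    return stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => implode("\r\n", $headers),
            'content' => http_build_query($fields), // URL-encoded form body
            'timeout' => $timeout,                  // read timeout in seconds
        ],
    ]);
}
```

The context is passed as the third argument, for example file_get_contents($loginUrl, false, buildPostContext($loginFields)). Unlike the cURL cookie jar, nothing is stored automatically here: any Set-Cookie headers in the response have to be picked out and passed back in by hand.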

Wednesday, June 2, 2021
 
EastSw
answered 5 Months ago
93

As of June 2011, Windows Live ID supports OAuth 2.0 and should enable you to do that. A WPF code example can be found at https://github.com/liveservices/LiveSDK/tree/master/Samples/CSharpDesktop.

Sunday, August 15, 2021
 
sophie
answered 2 Months ago
32

For these kinds of things I use expect.

You need to install expect first. If you're on Ubuntu, run sudo apt-get install expect.

Then in a script, let's call it heroku_login.exp, enter this with the relevant information:

#!/usr/bin/expect
spawn heroku "login"

expect "Email:"

send "YOUREMAIL"

send "\r"

expect "Password (typing will be hidden):"

send "YOURPASSWORD"

send "\r"

interact

Then run expect heroku_login.exp and you should be good to go.

Saturday, August 28, 2021
 
ranhan
answered 2 Months ago