1. Create a new .py file and save it as web.py:

    import webbrowser

    with open("urls.txt") as f:
        for url in f:
            webbrowser.open(url.strip())

2. Create a "urls.txt" file:

    https://www.google.com/
    https://www.youtube.com/

3. Head to CMD and type: python web.py

4. That's it: the browser opens each URL listed in urls.txt.

Source code: Lib/urllib/request.py

See also: The Requests package is recommended for a higher-level HTTP client interface.

urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None, capath=None, cadefault=False, context=None)¶
Open the URL url, which can be either a string or a Request object.

data must be an object specifying additional data to be sent to the server, or None if no such data is needed. The urllib.request module uses HTTP/1.1 and includes a Connection: close header in its HTTP requests.

The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used). This actually only works for HTTP, HTTPS and FTP connections.

If context is specified, it must be an ssl.SSLContext instance describing the various SSL options.

The optional cafile and capath parameters specify a set of trusted CA certificates for HTTPS requests. cafile should point to a single file containing a bundle of CA certificates, whereas capath should point to a directory of hashed certificate files. More information can be found in ssl.SSLContext.load_verify_locations().

The cadefault parameter is ignored.

This function always returns an object which can work as a context manager and has the properties url, headers, and status.

For HTTP and HTTPS URLs, this function returns an http.client.HTTPResponse object, slightly modified.

For FTP, file, and data URLs and requests explicitly handled by legacy URLopener and FancyURLopener classes, this function returns a urllib.response.addinfourl object.

Raises URLError on protocol errors.

Note that None may be returned if no handler handles the request (though the default installed global OpenerDirector uses UnknownHandler to ensure this never happens).

In addition, if proxy settings are detected (for example, when a *_proxy environment variable like http_proxy is set), ProxyHandler is installed by default and makes sure the requests are handled through the proxy.

The legacy urllib.urlopen function from Python 2.6 and earlier has been discontinued; urllib.request.urlopen() corresponds to the old urllib2.urlopen.

The default opener raises an auditing event urllib.Request with arguments fullurl, data, headers, method taken from the request object.

Changed in version 3.2: cafile and capath were added.

Changed in version 3.2: HTTPS virtual hosts are now supported if possible (that is, if ssl.HAS_SNI is true).

New in version 3.2: data can be an iterable object.

Changed in version 3.3: cadefault was added.

Changed in version 3.4.3: context was added.

Changed in version 3.10: HTTPS connections now send an ALPN extension with protocol indicator http/1.1 when no context is given. Custom context should set ALPN protocols with set_alpn_protocols().

urllib.request.install_opener(opener)¶
Install an OpenerDirector instance as the default global opener. Installing an opener is only necessary if you want urlopen to use that opener; otherwise, simply call OpenerDirector.open() instead of urlopen(). The code does not check for a real OpenerDirector, and any class with the appropriate interface will work.

urllib.request.build_opener([handler, ...])¶
Return an OpenerDirector instance, which chains the handlers in the order given. handlers can be either instances of BaseHandler, or subclasses of BaseHandler (in which case it must be possible to call the constructor without any parameters). Instances of the following classes will be in front of the handlers, unless the handlers contain them, instances of them or subclasses of them: ProxyHandler (if proxy settings are detected), UnknownHandler, HTTPHandler, HTTPDefaultErrorHandler, HTTPRedirectHandler, FTPHandler, FileHandler, HTTPErrorProcessor.

If the Python installation has SSL support (i.e., if the ssl module can be imported), HTTPSHandler will also be added.

A BaseHandler subclass may also change its handler_order attribute to modify its position in the handlers list.

urllib.request.pathname2url(path)¶
Convert the pathname path from the local syntax for a path to the form used in the path component of a URL. This does not produce a complete URL. The return value will already be quoted using the quote() function.
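A quick offline sketch of pathname2url() and its inverse, url2pathname(). The exact quoting is platform dependent; a POSIX path is shown here:

```python
from urllib.request import pathname2url, url2pathname

path = "/tmp/report 2024.txt"
url_path = pathname2url(path)   # percent-encodes the local path for use in a URL
print(url_path)                 # e.g. '/tmp/report%202024.txt' on POSIX
print(url2pathname(url_path))   # round-trips back to the local path syntax
```

Note that neither function produces or accepts a complete URL; they operate only on the path component.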
urllib.request.url2pathname(path)¶
Convert the path component path from a percent-encoded URL to the local syntax for a path. This does not accept a complete URL. This function uses unquote() to decode path.

urllib.request.getproxies()¶
This helper function returns a dictionary of scheme to proxy server URL mappings. It scans the environment for variables named <scheme>_proxy, in a case insensitive way, for all operating systems first, and when it cannot find it, looks for proxy information from System Configuration on macOS and the Windows Systems Registry on Windows. If both lowercase and uppercase environment variables exist (and disagree), lowercase is preferred.

Note: If the environment variable REQUEST_METHOD is set, which usually indicates your script is running in a CGI environment, the uppercase environment variable HTTP_PROXY will be ignored.

The following classes are provided:

class urllib.request.Request(url, data=None, headers={}, origin_req_host=None, unverifiable=False, method=None)¶
This class is an abstraction of a URL request.

url should be a string containing a valid URL.

data must be an object specifying additional data to send to the server, or None if no such data is needed. For an HTTP POST request method, data should be a buffer in the standard application/x-www-form-urlencoded format.

headers should be a dictionary, and will be treated as if add_header() was called with each key and value as arguments.

An appropriate Content-Type header should be included if the data argument is present.

The next two arguments are only of interest for correct handling of third-party HTTP cookies:

origin_req_host should be the request-host of the origin transaction, as defined by RFC 2965. It defaults to http.cookiejar.request_host(self).

unverifiable should indicate whether the request is unverifiable, as defined by RFC 2965. It defaults to False.

method should be a string that indicates the HTTP request method that will be used (e.g. 'HEAD'). If provided, its value is stored in the method attribute and is used by get_method().

Note: The request will not work as expected if the data object is unable to deliver its content more than once (e.g. a file or an iterable that can produce the content only once) and the request is retried for HTTP redirects or authentication. The data is sent to the HTTP server right away after the headers. There is no support for a 100-continue expectation in the library.

Changed in version 3.3: The method argument was added to the Request class.

Changed in version 3.4: A default method may be indicated at the class level.

Changed in version 3.6: Do not raise an error if the Content-Length has not been provided and data is neither None nor bytes. Fall back to use chunked transfer encoding instead.

class urllib.request.BaseHandler¶
This is the base class for all registered handlers, and handles only the simple mechanics of registration.

class urllib.request.HTTPDefaultErrorHandler¶
A class which defines a default handler for HTTP error responses; all responses are turned into HTTPError exceptions.
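The Request class described above can be exercised entirely offline, since constructing a request does not open a connection. A sketch with a made-up URL (the request is built but never sent):

```python
import urllib.request

# Hypothetical endpoint; nothing is sent over the network here.
req = urllib.request.Request(
    "http://www.example.com/api?id=7",
    data=b"spam=1&eggs=2",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(req.get_method())                # 'POST', because data was supplied
print(req.has_header("Content-type"))  # True; header names are stored capitalized
print(req.host)                        # 'www.example.com'
print(req.selector)                    # '/api?id=7'
```

Passing data switches the computed method from GET to POST, which is why get_method() reports 'POST' even though no method argument was given.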
class urllib.request.HTTPRedirectHandler¶
A class to handle redirections.

class urllib.request.HTTPCookieProcessor(cookiejar=None)¶
A class to handle HTTP Cookies.

class urllib.request.ProxyHandler(proxies=None)¶
Cause requests to go through a proxy. If proxies is given, it must be a dictionary mapping protocol names to URLs of proxies. The default is to read the list of proxies from the environment variables <protocol>_proxy. If no proxy environment variables are set, then in a Windows environment proxy settings are obtained from the registry's Internet Settings section, and in a macOS environment proxy information is retrieved from the System Configuration Framework.

To disable autodetected proxy pass an empty dictionary. The no_proxy environment variable can be used to specify hosts which shouldn't be reached via proxy.

class urllib.request.HTTPPasswordMgr¶
Keep a database of (realm, uri) -> (user, password) mappings.

class urllib.request.HTTPPasswordMgrWithDefaultRealm¶
Keep a database of (realm, uri) -> (user, password) mappings. A realm of None is considered a catch-all realm, which is searched if no other realm fits.

class urllib.request.HTTPPasswordMgrWithPriorAuth¶
A variant of HTTPPasswordMgrWithDefaultRealm that also has a database of uri -> is_authenticated mappings. Can be used by a BasicAuth handler to determine when to send authentication credentials immediately instead of waiting for a 401 response first.

New in version 3.5.

class urllib.request.AbstractBasicAuthHandler(password_mgr=None)¶
This is a mixin class that helps with HTTP authentication, both to the remote host and to a proxy. password_mgr, if given, should be something that is compatible with HTTPPasswordMgr; refer to section HTTPPasswordMgr Objects for information on the interface that must be supported. If password_mgr also provides is_authenticated and update_authenticated methods, the handler will use the is_authenticated result for a given URI to determine whether or not to send authentication credentials with the request.

New in version 3.5: Added is_authenticated support.

class urllib.request.HTTPBasicAuthHandler(password_mgr=None)¶
Handle authentication with the remote host. password_mgr, if given, should be something that is compatible with HTTPPasswordMgr.

class urllib.request.ProxyBasicAuthHandler(password_mgr=None)¶
Handle authentication with the proxy. password_mgr, if given, should be something that is compatible with HTTPPasswordMgr.

class urllib.request.AbstractDigestAuthHandler(password_mgr=None)¶
This is a mixin class that helps with HTTP authentication, both to the remote host and to a proxy. password_mgr, if given, should be something that is compatible with HTTPPasswordMgr.

class urllib.request.HTTPDigestAuthHandler(password_mgr=None)¶
Handle authentication with the remote host. password_mgr, if given, should be something that is compatible with HTTPPasswordMgr.

Changed in version 3.3: Raise ValueError on an unsupported Digest authentication algorithm.

class urllib.request.ProxyDigestAuthHandler(password_mgr=None)¶
Handle authentication with the proxy. password_mgr, if given, should be something that is compatible with HTTPPasswordMgr.

class urllib.request.HTTPHandler¶
A class to handle opening of HTTP URLs.

class urllib.request.HTTPSHandler(debuglevel=0, context=None, check_hostname=None)¶
A class to handle opening of HTTPS URLs. context and check_hostname have the same meaning as in http.client.HTTPSConnection.

Changed in version 3.2: context and check_hostname were added.

class urllib.request.FileHandler¶
Open local files.

class urllib.request.DataHandler¶
Open data URLs.

New in version 3.4.

class urllib.request.FTPHandler¶
Open FTP URLs.

class urllib.request.CacheFTPHandler¶
Open FTP URLs, keeping a cache of open FTP connections to minimize delays.

class urllib.request.UnknownHandler¶
A catch-all class to handle unknown URLs.

class urllib.request.HTTPErrorProcessor¶
Process HTTP error responses.

Request Objects¶

The following methods describe Request's public interface.

Request.full_url¶
The original URL passed to the constructor.

Changed in version 3.4: Request.full_url is a property with setter, getter and a deleter. Getting full_url returns the original request URL with the fragment, if present.
Request.type¶
The URI scheme.

Request.host¶
The URI authority, typically a host, but may also contain a port separated by a colon.

Request.origin_req_host¶
The original host for the request, without port.

Request.selector¶
The URI path. If the Request uses a proxy, then selector will be the full URL that is passed to the proxy.

Request.data¶
The entity body for the request, or None if not specified.

Changed in version 3.4: Changing the value of Request.data now deletes the Content-Length header if it was previously set or calculated.

Request.unverifiable¶
boolean, indicates whether the request is unverifiable as defined by RFC 2965.

Request.method¶
The HTTP request method to use. By default its value is None, which means that get_method() will do its normal computation of the method to be used.

New in version 3.3.

Changed in version 3.4: A default value can now be set in subclasses; previously it could only be set via the constructor argument.

Request.get_method()¶
Return a string indicating the HTTP request method. If Request.method is not None, its value is returned; otherwise 'GET' is returned if Request.data is None, and 'POST' if it is not.

Changed in version 3.3: get_method now looks at the value of Request.method.

Request.add_header(key, val)¶
Add another header to the request. Headers are currently ignored by all handlers except HTTP handlers, where they are added to the list of headers sent to the server. Note that there cannot be more than one header with the same name, and later calls will overwrite previous calls in case the key collides. Currently, this is no loss of HTTP functionality, since all headers which have meaning when used more than once have a (header-specific) way of gaining the same functionality using only one header. Note that headers added using this method are also added to redirected requests.

Request.add_unredirected_header(key, header)¶
Add a header that will not be added to a redirected request.

Request.has_header(header)¶
Return whether the instance has the named header (checks both regular and unredirected).

Request.remove_header(header)¶
Remove the named header from the request instance (both from regular and unredirected headers).

New in version 3.4.

Request.get_full_url()¶
Return the URL given in the constructor.

Changed in version 3.4: Returns Request.full_url.

Request.set_proxy(host, type)¶
Prepare the request by connecting to a proxy server. The host and type will replace those of the instance, and the instance's selector will be the original URL given in the constructor.

Request.get_header(header_name, default=None)¶
Return the value of the given header. If the header is not present, return the default value.

Request.header_items()¶
Return a list of tuples (header_name, header_value) of the Request headers.

Changed in version 3.4: The request methods add_data, has_data, get_data, get_type, get_host, get_selector, get_origin_req_host and is_unverifiable that were deprecated since 3.3 have been removed.

OpenerDirector Objects¶
OpenerDirector.add_handler(handler)¶
handler should be an instance of BaseHandler. The handler's methods are searched and added to the possible chains (note that HTTP errors are a special case).

OpenerDirector.open(url, data=None[, timeout])¶
Open the given url (which can be a request object or a string), optionally passing the given data. Arguments, return values and exceptions raised are the same as those of urlopen() (which simply calls the open() method on the currently installed global OpenerDirector).

OpenerDirector.error(proto, *args)¶
Handle an error of the given protocol. This will call the registered error handlers for the given protocol with the given arguments (which are protocol specific). The HTTP protocol is a special case which uses the HTTP response code to determine the specific error handler; refer to the http_error_<nnn>() methods of the handler classes.

Return values and exceptions raised are the same as those of urlopen().

OpenerDirector objects open URLs in three stages:

1. Every handler with a method named like <protocol>_request() has that method called to pre-process the request.

2. Handlers with a method named like <protocol>_open() are called to handle the request. This stage ends when a handler either returns a non-None value (ie. a response), or raises an exception (usually URLError). This algorithm is first tried for methods named default_open(); if all such methods return None, it is repeated for methods named like <protocol>_open(), and then for methods named unknown_open().

3. Every handler with a method named like <protocol>_response() has that method called to post-process the response.

The order in which these methods are called within each stage is determined by sorting the handler instances.
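A minimal, offline illustration of an OpenerDirector at work: the default chain built by build_opener() includes DataHandler, so it can serve a data: URL without touching the network.

```python
import urllib.request

# Default handler chain; DataHandler resolves the URL with no network access.
opener = urllib.request.build_opener()
with opener.open("data:text/plain;charset=utf-8,hello%20world") as resp:
    body = resp.read().decode("utf-8")

print(body)  # hello world
```

The three stages described above still run here; the data: scheme simply makes the "open" stage resolvable locally, which is convenient for experimenting with custom handlers.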
BaseHandler Objects¶

BaseHandler.add_parent(director)¶
Add a director as parent.

BaseHandler.close()¶
Remove any parents.

The following attribute and methods should only be used by classes derived from BaseHandler.

Note: The convention has been adopted that subclasses defining <protocol>_request() or <protocol>_response() methods are named *Processor; all others are named *Handler.

BaseHandler.parent¶
A valid OpenerDirector, which can be used to open using a different protocol, or handle errors.

BaseHandler.default_open(req)¶
This method is not defined in BaseHandler, but subclasses should define it if they want to catch all URLs.

This method, if implemented, will be called by the parent OpenerDirector. It should return a file-like object as described in the return value of the open() method of OpenerDirector, or None. It should raise URLError, unless a truly exceptional thing happens.

This method will be called before any protocol-specific open method.

BaseHandler.<protocol>_open(req)
This method is not defined in BaseHandler, but subclasses should define it if they want to handle URLs with the given protocol.

This method, if defined, will be called by the parent OpenerDirector. Return values should be the same as for default_open().

BaseHandler.unknown_open(req)¶
This method is not defined in BaseHandler, but subclasses should define it if they want to catch all URLs with no specific registered handler.

This method, if implemented, will be called by the parent OpenerDirector. Return values should be the same as for default_open().

BaseHandler.http_error_default(req, fp, code, msg, hdrs)¶
This method is not defined in BaseHandler, but subclasses should override it if they intend to provide a catch-all for otherwise unhandled HTTP errors.

req will be a Request object, fp will be a file-like object with the HTTP error body, code will be the three-digit code of the error, msg will be the user-visible explanation of the code and hdrs will be a mapping object with the headers of the error.

Return values and exceptions raised should be the same as those of urlopen().

BaseHandler.http_error_<nnn>(req, fp, code, msg, hdrs)
nnn should be a three-digit HTTP error code. This method is also not defined in BaseHandler, but will be called, if it exists, on an instance of a subclass when an HTTP error with code nnn occurs.

Subclasses should override this method to handle specific HTTP errors.

Arguments, return values and exceptions raised should be the same as for http_error_default().

BaseHandler.<protocol>_request(req)
This method is not defined in BaseHandler, but subclasses should define it if they want to pre-process requests of the given protocol. This method, if defined, will be called by the parent OpenerDirector. req will be a Request object, and the return value should be a Request object.

BaseHandler.<protocol>_response(req, response)
This method is not defined in BaseHandler, but subclasses should define it if they want to post-process responses of the given protocol. This method, if defined, will be called by the parent OpenerDirector. response will be an object implementing the same interface as the return value of urlopen(), and the return value should implement that same interface.

HTTPRedirectHandler Objects¶

Note: Some HTTP redirections require action from this module's client code. If this is the case, HTTPError is raised. See RFC 2616 for details of the precise meanings of the various redirection codes.

An HTTPError exception is raised as a security consideration if the HTTPRedirectHandler is presented with a redirected URL which is not an HTTP, HTTPS or FTP URL.

HTTPRedirectHandler.redirect_request(req, fp, code, msg, hdrs, newurl)¶
Return a Request or None in response to a redirect. This is called by the default implementations of the http_error_30*() methods when a redirection is received from the server. If a redirection should take place, return a new Request to allow http_error_30*() to perform the redirect to newurl. Otherwise, raise HTTPError if no other handler should try to handle this URL, or return None if you can't but another handler might.

Note: The default implementation of this method does not strictly follow RFC 2616, which says that 301 and 302 responses to POST requests must not be automatically redirected without confirmation by the user. In reality, browsers do allow automatic redirection of these responses, changing the POST to a GET, and the default implementation reproduces this behavior.

HTTPRedirectHandler.http_error_301(req, fp, code, msg, hdrs)¶
Redirect to the Location: or URI: URL. This method is called by the parent OpenerDirector when getting an HTTP 'moved permanently' response.

HTTPRedirectHandler.http_error_302(req, fp, code, msg, hdrs)¶
The same as http_error_301(), but called for the 'found' response.

HTTPRedirectHandler.http_error_303(req, fp, code, msg, hdrs)¶
The same as http_error_301(), but called for the 'see other' response.

HTTPRedirectHandler.http_error_307(req, fp, code, msg, hdrs)¶
The same as http_error_301(), but called for the 'temporary redirect' response.
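As an offline sketch, the default redirect_request() implementation can be called directly to observe the POST-to-GET rewriting described in the note above. The URLs are made up, and None/{} are passed for fp and the header mapping, which the default implementation only consults on its error path:

```python
import urllib.request

handler = urllib.request.HTTPRedirectHandler()
post = urllib.request.Request("http://www.example.com/old", data=b"k=v")
print(post.get_method())   # POST, because data was given

# Simulate receiving a 302 that points at a new location.
new = handler.redirect_request(post, None, 302, "Found", {},
                               "http://www.example.com/new")
print(new.get_method())    # GET: the POST was rewritten, as the note describes
print(new.full_url)        # http://www.example.com/new
```

In real use this method is invoked for you by http_error_302() during an actual HTTP exchange; calling it directly is only useful for testing custom overrides.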
HTTPCookieProcessor Objects¶

HTTPCookieProcessor.cookiejar¶
The http.cookiejar.CookieJar in which cookies are stored.

ProxyHandler Objects¶

ProxyHandler.<protocol>_open(request)
The ProxyHandler will have a method <protocol>_open() for every protocol which has a proxy in the proxies dictionary given in the constructor. The method will modify requests to go through the proxy, by calling request.set_proxy(), and call the next handler in the chain to actually execute the protocol.

HTTPPasswordMgr Objects¶

These methods are available on HTTPPasswordMgr and HTTPPasswordMgrWithDefaultRealm objects.

HTTPPasswordMgr.add_password(realm, uri, user, passwd)¶
uri can be either a single URI, or a sequence of URIs. realm, user and passwd must be strings. This causes (user, passwd) to be used as authentication tokens when authentication for realm and a super-URI of any of the given URIs is given.

HTTPPasswordMgr.find_user_password(realm, authuri)¶
Get user/password for given realm and URI, if any. This method will return (None, None) if there is no matching user/password.

For HTTPPasswordMgrWithDefaultRealm objects, the realm None will be searched if the given realm has no matching user/password.

HTTPPasswordMgrWithPriorAuth Objects¶

This password manager extends HTTPPasswordMgrWithDefaultRealm to support tracking URIs for which authentication credentials should always be sent.

HTTPPasswordMgrWithPriorAuth.add_password(realm, uri, user, passwd, is_authenticated=False)¶
realm, uri, user, passwd are as for HTTPPasswordMgr.add_password(). is_authenticated sets the initial value of the is_authenticated flag for the given URI or list of URIs.

HTTPPasswordMgrWithPriorAuth.find_user_password(realm, authuri)¶
Same as for HTTPPasswordMgrWithDefaultRealm objects.

HTTPPasswordMgrWithPriorAuth.update_authenticated(self, uri, is_authenticated=False)¶
Update the is_authenticated flag for the given uri or list of URIs.

HTTPPasswordMgrWithPriorAuth.is_authenticated(self, authuri)¶
Returns the current state of the is_authenticated flag for the given URI.

AbstractBasicAuthHandler Objects¶

AbstractBasicAuthHandler.http_error_auth_reqed(authreq, host, req, headers)¶
Handle an authentication request by getting a user/password pair, and re-trying the request. authreq should be the name of the header where the information about the realm is included in the request, host specifies the URL and path to authenticate for, req should be the (failed) Request object, and headers should be the error headers.

host is either an authority (e.g. "python.org") or a URL containing an authority component (e.g. "http://python.org/"). In either case, the authority must not contain a userinfo component.

HTTPBasicAuthHandler Objects¶

HTTPBasicAuthHandler.http_error_401(req, fp, code, msg, hdrs)¶
Retry the request with authentication information, if available.

ProxyBasicAuthHandler Objects¶

ProxyBasicAuthHandler.http_error_407(req, fp, code, msg, hdrs)¶
Retry the request with authentication information, if available.

AbstractDigestAuthHandler Objects¶

AbstractDigestAuthHandler.http_error_auth_reqed(authreq, host, req, headers)¶
authreq should be the name of the header where the information about the realm is included in the request, host should be the host to authenticate to, req should be the (failed) Request object, and headers should be the error headers.

HTTPDigestAuthHandler Objects¶

HTTPDigestAuthHandler.http_error_401(req, fp, code, msg, hdrs)¶
Retry the request with authentication information, if available.

ProxyDigestAuthHandler Objects¶

ProxyDigestAuthHandler.http_error_407(req, fp, code, msg, hdrs)¶
Retry the request with authentication information, if available.

HTTPHandler Objects¶

HTTPHandler.http_open(req)¶
Send an HTTP request, which can be either GET or POST, depending on whether the request has data.

HTTPSHandler Objects¶

HTTPSHandler.https_open(req)¶
Send an HTTPS request, which can be either GET or POST, depending on whether the request has data.

FileHandler Objects¶

FileHandler.file_open(req)¶
Open the file locally, if there is no host name, or the host name is 'localhost'.

Changed in version 3.2: This method is applicable only for local hostnames. When a remote hostname is given, an URLError is raised.

DataHandler Objects¶

DataHandler.data_open(req)¶
Read a data URL. This kind of URL contains the content encoded in the URL itself. The data URL syntax is specified in RFC 2397. This implementation ignores white spaces in base64 encoded data URLs so the URL may be wrapped in whatever source file it comes from. But even though some browsers don't mind about a missing padding at the end of a base64 encoded data URL, this implementation will raise a ValueError in that case.
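A small offline demonstration of DataHandler, exercised through urlopen(). The base64 payload "aGVsbG8=" decodes to b'hello':

```python
import urllib.request

# The content is embedded in the URL itself; no network access is involved.
with urllib.request.urlopen("data:text/plain;base64,aGVsbG8=") as resp:
    payload = resp.read()

print(payload)  # b'hello'
```

Dropping the trailing "=" padding from the payload would make this raise a ValueError, per the note above.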
FTPHandler Objects¶

FTPHandler.ftp_open(req)¶
Open the FTP file indicated by req. The login is always done with empty username and password.

CacheFTPHandler Objects¶

CacheFTPHandler objects are FTPHandler objects with the following additional methods:

CacheFTPHandler.setTimeout(t)¶
Set timeout of connections to t seconds.

CacheFTPHandler.setMaxConns(m)¶
Set maximum number of cached connections to m.

UnknownHandler Objects¶

UnknownHandler.unknown_open()¶
Raise a URLError exception.

HTTPErrorProcessor Objects¶

HTTPErrorProcessor.http_response(request, response)¶
Process HTTP error responses. For 200 error codes, the response object is returned immediately. For non-200 error codes, this simply passes the job on to the http_error_<type>() handler methods, via OpenerDirector.error().

HTTPErrorProcessor.https_response(request, response)¶
Process HTTPS error responses. The behavior is the same as http_response().

Examples¶

In addition to the examples below, more examples are given in HOWTO Fetch Internet Resources Using The urllib Package.

This example gets the python.org main page and displays the first 300 bytes of it.

    >>> import urllib.request
    >>> with urllib.request.urlopen('http://www.python.org/') as f:
    ...     print(f.read(300))
    ...
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n\n<head>\n <meta http-equiv="content-type" content="text/html; charset=utf-8" />\n <title>Python Programming '

Note that urlopen returns a bytes object. This is because there is no way for urlopen to automatically determine the encoding of the byte stream it receives from the HTTP server. In general, a program will decode the returned bytes object to string once it determines or guesses the appropriate encoding.

The following W3C document, https://www.w3.org/International/O-charset, lists the various ways in which an (X)HTML or an XML document could have specified its encoding information.

As the python.org website uses utf-8 encoding as specified in its meta tag, we will use the same for decoding the bytes object.

    >>> with urllib.request.urlopen('http://www.python.org/') as f:
    ...     print(f.read(100).decode('utf-8'))
    ...
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtm

It is also possible to achieve the same result without using the context manager approach.

    >>> import urllib.request
    >>> f = urllib.request.urlopen('http://www.python.org/')
    >>> print(f.read(100).decode('utf-8'))
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtm

In the following example, we are sending a data-stream to the stdin of a CGI and reading the data it returns to us. Note that this example will only work when the Python installation supports SSL.

    >>> import urllib.request
    >>> req = urllib.request.Request(url='https://localhost/cgi-bin/test.cgi',
    ...                              data=b'This data is passed to stdin of the CGI')
    >>> with urllib.request.urlopen(req) as f:
    ...     print(f.read().decode('utf-8'))
    ...
    Got Data: "This data is passed to stdin of the CGI"

The code for the sample CGI used in the above example is:

    #!/usr/bin/env python
    import sys
    data = sys.stdin.read()
    print('Content-type: text/plain\n\nGot Data: "%s"' % data)

Here is an example of doing a PUT request using Request:

    import urllib.request
    DATA = b'some data'
    req = urllib.request.Request(url='http://localhost:8080', data=DATA, method='PUT')
    with urllib.request.urlopen(req) as f:
        pass
    print(f.status)
    print(f.reason)

Use of Basic HTTP Authentication:

    import urllib.request
    # Create an OpenerDirector with support for Basic HTTP Authentication...
    auth_handler = urllib.request.HTTPBasicAuthHandler()
    auth_handler.add_password(realm='PDQ Application',
                              uri='https://mahler:8092/site-updates.py',
                              user='klem',
                              passwd='kadidd!ehopper')
    opener = urllib.request.build_opener(auth_handler)
    # ...and install it globally so it can be used with urlopen.
    urllib.request.install_opener(opener)
    urllib.request.urlopen('http://www.example.com/login.html')
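The network examples above will raise urllib.error.URLError (or its subclass HTTPError) when something goes wrong, so production code normally wraps urlopen() in a try/except. The failure can be sketched offline with a file: URL whose path does not exist; the underlying OSError is wrapped in the same URLError used for network failures:

```python
import urllib.request
import urllib.error

def fetch(url):
    """Return (body, error) instead of letting URLError propagate."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.read(), None
    except urllib.error.URLError as exc:
        return None, exc

# A deliberately nonexistent local path stands in for a failing server.
body, err = fetch("file:///this/path/does/not/exist")
print(err is not None)  # True
print(err.reason)       # the wrapped OSError
```

Catching URLError also catches HTTPError, since HTTPError subclasses it; catch HTTPError first if the two cases need different handling.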
This example replaces the default ProxyHandler with one that uses a programmatically supplied proxy URL, and adds proxy authorization support with ProxyBasicAuthHandler:

    proxy_handler = urllib.request.ProxyHandler({'http': 'http://www.example.com:3128/'})
    proxy_auth_handler = urllib.request.ProxyBasicAuthHandler()
    proxy_auth_handler.add_password('realm', 'host', 'username', 'password')

    opener = urllib.request.build_opener(proxy_handler, proxy_auth_handler)
    # This time, rather than install the OpenerDirector, we use it directly:
    opener.open('http://www.example.com/login.html')

Adding HTTP headers:

Use the headers argument to the Request constructor, or:

    import urllib.request
    req = urllib.request.Request('http://www.example.com/')
    req.add_header('Referer', 'http://www.python.org/')
    # Customize the default User-Agent header value:
    req.add_header('User-Agent', 'urllib-example/0.1 (Contact: . . .)')
    r = urllib.request.urlopen(req)

OpenerDirector automatically adds a User-Agent header to every Request. To change this:

    import urllib.request
    opener = urllib.request.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    opener.open('http://www.example.com/')

Also, remember that a few standard headers (Content-Length, Content-Type and Host) are added when the Request is passed to urlopen() (or OpenerDirector.open()).

Here is an example session that uses the GET method to retrieve a URL containing parameters:

    >>> import urllib.request
    >>> import urllib.parse
    >>> params = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
    >>> url = "http://www.musi-cal.com/cgi-bin/query?%s" % params
    >>> with urllib.request.urlopen(url) as f:
    ...     print(f.read().decode('utf-8'))
    ...

The following example uses the POST method instead. Note that the params output from urlencode is encoded to bytes before it is used as the data parameter:

    >>> import urllib.request
    >>> import urllib.parse
    >>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
    >>> data = data.encode('ascii')
    >>> with urllib.request.urlopen("http://requestb.in/xrbl82xr", data) as f:
    ...     print(f.read().decode('utf-8'))
    ...

The following example uses an explicitly specified HTTP proxy, overriding environment settings:

    >>> import urllib.request
    >>> proxies = {'http': 'http://proxy.example.com:8080/'}
    >>> opener = urllib.request.FancyURLopener(proxies)
    >>> with opener.open("http://www.python.org") as f:
    ...     f.read().decode('utf-8')
    ...

The following example uses no proxies at all, overriding environment settings:

    >>> import urllib.request
    >>> opener = urllib.request.FancyURLopener({})
    >>> with opener.open("http://www.python.org/") as f:
    ...     f.read().decode('utf-8')
    ...

Legacy interface¶

The following functions and classes are ported from the Python 2 module urllib. They might become deprecated at some point in the future.

urllib.request.urlretrieve(url, filename=None, reporthook=None, data=None)¶
Copy a network object denoted by a URL to a local file. If the URL points to a local file, the object will not be copied unless filename is supplied. Return a tuple (filename, headers) where filename is the local file name under which the object can be found, and headers is whatever the info() method of the object returned by urlopen() returned.

The second argument, if present, specifies the file location to copy to (if absent, the location will be a tempfile with a generated name). The third argument, if present, is a callable that will be called once on establishment of the network connection and once after each block read thereafter. The callable will be passed three arguments; a count of blocks transferred so far, a block size in bytes, and the total size of the file. The third argument may be -1 on older FTP servers which do not return a file size in response to a retrieval request.

The following example illustrates the most common usage scenario:

    >>> import urllib.request
    >>> local_filename, headers = urllib.request.urlretrieve('http://python.org/')
    >>> html = open(local_filename)
    >>> html.close()

If the url uses the http: scheme identifier, the optional data argument may be given to specify a POST request.

urlretrieve() will raise ContentTooShortError when it detects that the amount of data available was less than the expected amount (which is the size reported by a Content-Length header). This can occur, for example, when the download is interrupted.

The Content-Length is treated as a lower bound: if there's more data to read, urlretrieve reads more data, but if less data is available, it raises the exception.

You can still retrieve the downloaded data in this case, it is stored in the content attribute of the exception instance.

If no Content-Length header was supplied, urlretrieve can not check the size of the data it has downloaded, and just returns it. In this case you just have to assume that the download was successful.

urllib.request.urlcleanup()¶
Cleans up temporary files that may have been left behind by previous calls to urlretrieve().
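urlretrieve() accepts any URL the default opener understands, which allows an offline sketch using a data: URL; real usage would of course pass an http: or ftp: URL as in the example above:

```python
import urllib.request

# Retrieve into a tempfile with a generated name. A data: URL embeds its
# content in the URL itself, so no network access is needed.
filename, headers = urllib.request.urlretrieve("data:,hello%20world")
with open(filename) as f:
    content = f.read()

print(content)  # hello world

urllib.request.urlcleanup()  # remove the temporary file again
```

Pairing urlretrieve() with urlcleanup(), as here, avoids leaving generated tempfiles behind.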
class urllib.request.URLopener(proxies=None, **x509)¶
Deprecated since version 3.3.

Base class for opening and reading URLs. Unless you need to support opening objects using schemes other than http:, ftp:, or file:, you probably want FancyURLopener.

By default, the URLopener class sends a User-Agent header of urllib/VVV, where VVV is the urllib version number. Applications can define their own User-Agent header by subclassing URLopener or FancyURLopener and setting the class attribute version to an appropriate string value in the subclass definition.

The optional proxies parameter should be a dictionary mapping scheme names to proxy URLs, where an empty dictionary turns proxies off completely. Its default value is None, in which case environmental proxy settings will be used if present, as discussed in the definition of urlopen(), above.

Additional keyword parameters, collected in x509, may be used for authentication of the client when using the https: scheme. The keywords key_file and cert_file are supported to provide an SSL key and certificate; both are needed to support client authentication.

open(fullurl, data=None)¶
Open fullurl using the appropriate protocol. This method sets up cache and proxy information, then calls the appropriate open method with its input arguments. If the scheme is not recognized, open_unknown() is called. The data argument has the same meaning as the data argument of urlopen().

This method always quotes fullurl using quote().

open_unknown(fullurl, data=None)¶
Overridable interface to open unknown URL types.

retrieve(url, filename=None, reporthook=None, data=None)¶
Retrieves the contents of url and places it in filename. The return value is a tuple consisting of a local filename and either an email.message.Message object containing the response headers (for remote URLs) or None (for local URLs).

If the url uses the http: scheme identifier, the optional data argument may be given to specify a POST request.

version¶
Variable that specifies the user agent of the opener object. To get urllib to tell servers that it is a particular user agent, set this in a subclass as a class variable or in the constructor before calling the base constructor.

class urllib.request.FancyURLopener(...)¶
Deprecated since version 3.3.

FancyURLopener subclasses URLopener, providing default handling for the following HTTP response codes: 301, 302, 303, 307 and 401. For the 30x response codes listed above, the Location header is used to fetch the actual URL. For 401 response codes (authentication required), basic HTTP authentication is performed. For all other response codes, the method http_error_default() is called, which you can override in subclasses to handle the error appropriately.

Note: According to the letter of RFC 2616, 301 and 302 responses to POST requests must not be automatically redirected without confirmation by the user. In reality, browsers do allow automatic redirection of these responses, changing the POST to a GET, and urllib reproduces this behaviour.

The parameters to the constructor are the same as those for URLopener.

Note: When performing basic authentication, a FancyURLopener instance calls its prompt_user_passwd() method. The default implementation asks the user for the required information on the controlling terminal. A subclass may override this method to support more appropriate behavior if needed.

The FancyURLopener class offers one additional method that should be overloaded to provide the appropriate behavior:

prompt_user_passwd(host, realm)¶
Return information needed to authenticate the user at the given host in the specified security realm. The return value should be a tuple, (user, password), which can be used for basic authentication.

The implementation prompts for this information on the terminal; an application should override this method to use an appropriate interaction model in the local environment.

urllib.request Restrictions¶
urllib.response — Response classes used by urllib¶

The urllib.response module defines functions and classes which define a minimal file-like interface, including read() and readline(). Functions defined by this module are used internally by the urllib.request module. The typical response object is an instance of:

class urllib.response.addinfourl¶

url¶
URL of the resource retrieved, commonly used to determine if a redirect was followed.

headers¶
Returns the headers of the response in the form of an EmailMessage instance.

status¶
New in version 3.9.
Status code returned by server.

geturl()¶
Deprecated since version 3.9: Deprecated in favor of url.

info()¶
Deprecated since version 3.9: Deprecated in favor of headers.

code¶
Deprecated since version 3.9: Deprecated in favor of status.

getcode()¶
Deprecated since version 3.9: Deprecated in favor of status.