- PHP Domain Parser
- =================
+ # PHP Domain Parser

  **PHP Domain Parser** is a [Public Suffix List](http://publicsuffix.org/) based
  domain parser implemented in PHP.

  [![Build Status](https://travis-ci.org/jeremykendall/php-domain-parser.png?branch=master)](https://travis-ci.org/jeremykendall/php-domain-parser)
- [![SensioLabsInsight](https://insight.sensiolabs.com/projects/13310245-48b5-43a2-ac30-269e059305e1/mini.png)](https://insight.sensiolabs.com/projects/13310245-48b5-43a2-ac30-269e059305e1)
  [![Total Downloads](https://poser.pugx.org/jeremykendall/php-domain-parser/downloads.png)](https://packagist.org/packages/jeremykendall/php-domain-parser)
  [![Latest Stable Version](https://poser.pugx.org/jeremykendall/php-domain-parser/v/stable.png)](https://packagist.org/packages/jeremykendall/php-domain-parser)

- Motivation
- ----------
+ ## Motivation

  While there are plenty of excellent URL parsers and builders available, there
  are very few projects that can accurately parse a url into its component
@@ -20,13 +17,12 @@ Consider the domain www.pref.okinawa.jp. In this domain, the
  *public suffix* portion is **okinawa.jp**, the *registerable domain* is
  **pref.okinawa.jp**, and the *subdomain* is **www**. You can't regex that.

- Other similar libraries focus primarily on URL building, parsing, and manipulation and
- additionally include public suffix domain parsing. PHP Domain Parser was built around
- accurate Public Suffix List based parsing from the very beginning, adding a URL
- object simply for the sake of completeness.
+ Other similar libraries focus primarily on URL building, parsing, and
+ manipulation and additionally include public suffix domain parsing. PHP Domain
+ Parser was built around accurate Public Suffix List based parsing from the very
+ beginning, adding a URL object simply for the sake of completeness.

- Installation
- ------------
+ ## Installation

  The only (currently) supported method of installation is via
  [Composer](http://getcomposer.org).
@@ -53,10 +49,9 @@ require_once 'vendor/autoload.php'

  You're now ready to begin using the PHP Domain Parser.

- Usage
- -----
+ ## Usage

- ### Parsing URLs ###
+ ### Parsing URLs

  Parsing URLs into their component parts is as simple as the example you see below.

@@ -104,8 +99,9 @@ class Pdp\Uri\Url#6 (8) {

  ### Convenience Methods

- A magic __get() method is provided to access the above object properties. Obtaining the
- public suffix for a parsed domain is as simple as:
+ A magic [`__get()`](http://php.net/manual/en/language.oop5.overloading.php#object.get)
+ method is provided to access the above object properties. Obtaining the public
+ suffix for a parsed domain is as simple as:

  ```php
  <?php
@@ -120,8 +116,8 @@ $publicSuffix = $url->host->publicSuffix;
  ### IDNA Support

  [IDN (Internationalized Domain Name)](http://en.wikipedia.org/wiki/Internationalized_domain_name)
- support was added in version `1.4.0`. Both unicode domains and their ASCII equivalents
- are supported.
+ support was added in version `1.4.0`. Both unicode domains and their ASCII
+ equivalents are supported.

  **IMPORTANT**:

@@ -132,13 +128,13 @@ required for [mb_strtolower](http://php.net/manual/en/function.mb-strtolower.php

  #### Unicode

- Parsing IDNA hosts is no different than parsing standard hosts. Setting `$host = 'Яндекс.РФ';` (Russian-Cyrillic)
- in the *Parsing URLs* example would return:
+ Parsing IDNA hosts is no different than parsing standard hosts. Setting `$host
+ = 'Яндекс.РФ';` (Russian-Cyrillic) in the *Parsing URLs* example would return:

  ```
  class Pdp\Uri\Url#6 (8) {
  private $scheme =>
- string(4) "http"
+ string(0) ""
  private $host =>
  class Pdp\Uri\Url\Host#5 (4) {
  private $subdomain =>
@@ -203,7 +199,8 @@ class Pdp\Uri\Url#6 (8) {

  ### IPv6 Support

- Parsing IPv6 hosts is no different than parsing standard hosts. Setting `$host = 'http://[2001:db8:85a3:8d3:1319:8a2e:370:7348]:8080/';`
+ Parsing IPv6 hosts is no different than parsing standard hosts. Setting `$host
+ = 'http://[2001:db8:85a3:8d3:1319:8a2e:370:7348]:8080/';`
  in the *Parsing URLs* example would return:

  ```
@@ -242,7 +239,7 @@ will not be parsed properly otherwise.
  > Hat tip to [@geekwright](https://github.com/geekwright) for adding IPv6 support in a
  > [bugfix pull request](https://github.com/jeremykendall/php-domain-parser/pull/35).

- ### Parsing Domains ###
+ ### Parsing Domains

  If you'd like to parse the domain (or host) portion only, you can use
  `Parser::parseHost()`.
@@ -279,7 +276,8 @@ var_dump($parser->isSuffixValid('www.example.com.au');
  // true
  ```

- A suffix is considered invalid if it is not contained in the [Public Suffix List](http://publicsuffix.org/).
+ A suffix is considered invalid if it is not contained in the
+ [Public Suffix List](http://publicsuffix.org/).

  > Huge thanks to [@SmellyFish](https://github.com/SmellyFish) for submitting
  > [Add a way to validate TLDs](https://github.com/jeremykendall/php-domain-parser/pull/36)
@@ -306,7 +304,7 @@ string(16) "scottwills.co.uk"
  string(5) "co.uk"
  ```

- ### Sanity Check ###
+ ### Sanity Check

  You can quickly parse a url from the command line with the provided `parse`
  vendor binary. From the root of your project, simply call:
@@ -318,7 +316,8 @@ $ ./vendor/bin/parse <url>
  If you pass a url to `parse`, that url will be parsed and the output printed
  to screen.

- If you do not pass a url, `http://user:pass@www.pref.okinawa.jp:8080/path/to/page.html?query=string#fragment` will be parsed and the output printed to screen.
+ If you do not pass a url, `http://user:pass@www.pref.okinawa.jp:8080/path/to/page.html?query=string#fragment`
+ will be parsed and the output printed to screen.

  Example:

@@ -342,12 +341,12 @@ Array
  Host: http://www.waxaudio.com.au/
  ```

- ### Example Script ###
+ ### Example Script

  For more information on using the PHP Domain Parser, please see the provided
  [example script](https://github.com/jeremykendall/php-domain-parser/blob/master/example.php).

- ### Refreshing the Public Suffix List ###
+ ### Refreshing the Public Suffix List

  While a cached PHP copy of the Public Suffix List is provided for you in the
  `data` directory, that copy may or may not be up to date (Mozilla provides an
@@ -358,22 +357,32 @@ refresh your cached copy of the Public Suffix List.
  From the root of your project, simply call:

  ```bash
- $ ./vendor/bin/pdp-psl
+ $ ./vendor/bin/update-psl
  ```

  You may verify the update by checking the timestamp on the files located in the
  `data` directory.

- **Important**: The vendor binary `pdp-psl` depends on an internet connection to
+ **Important**: The vendor binary `update-psl` depends on an internet connection to
  update the cached Public Suffix List.

- Contributing
- ------------
+ ## Possible Unexpected Behavior
+
+ PHP Domain Parser is built around PHP's
+ [`parse_url()`](http://php.net/parse_url) function and, as such, exhibits most
+ of the same behaviors of that function. Just like `parse_url()`, this library
+ is not meant to validate URLs, but rather to break a URL into its component
+ parts.
+
+ One specific, counterintuitive behavior is that PHP Domain Parser will happily
+ parse a URL with [spaces in the host part](https://github.com/jeremykendall/php-domain-parser/issues/45).
+
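Note on the section added above: the parse-versus-validate distinction can be seen with PHP's built-in functions alone. The sketch below is an illustration only and does not call PHP Domain Parser. The URL is a made-up example, and pairing `parse_url()` with `filter_var()` is an assumption about how a caller might add validation, not something this change prescribes.

```php
<?php
// Illustration only: PHP built-ins, not PHP Domain Parser itself.
// The URL below is a made-up example with a space in the host part.
$url = 'http://exa mple.com/path';

// parse_url() splits the string into components and does not validate them.
$parts = parse_url($url);

// FILTER_VALIDATE_URL is a separate, stricter check on the same input.
$isValid = filter_var($url, FILTER_VALIDATE_URL) !== false;

var_dump($parts);
var_dump($isValid);
```

If the space survives in `$parts['host']` while `$isValid` is `false`, that matches the behavior the new section warns about: callers that need validation should run it as a separate step before or after parsing.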
+ ## Contributing

  Pull requests are *always* welcome! Please review the CONTRIBUTING.md document before
  submitting pull requests.

- #### Heads up: BC Break In All 1.4 Versions
+ ## Heads up: BC Break In All 1.4 Versions

  The 1.4 series introduced a backwards incompatible change by adding PHP's `ext-mbstring`
  and `ext-intl` as dependencies. This should have resulted in a major version
@@ -383,11 +392,10 @@ I highly recommend reverting to 1.3.1 if you're running into extension issues an
  do not want to or cannot install `ext-mbstring` and `ext-intl`. You will lose
  IDNA and IPv6 support, however. Those are only available in versions >= 1.4.

- Version 2 is currently in the works. Please keep an eye out. I apologize for any
- issues you may have encountered due to my [semver](http://semver.org/) error.
+ I apologize for any issues you may have encountered due to my
+ [semver](http://semver.org/) error.

- Attribution
- -----------
+ ## Attribution

  The HTTP adapter interface and the cURL HTTP adapter were inspired by (er,
  lifted from) Will Durand's excellent