Definition and Usage
- PHP Version: 4+
The preg_split() function splits a string into an array, using the matches of a specified regular expression pattern as the delimiters.
The original string remains unchanged.
It is particularly useful when the delimiter cannot be expressed as a fixed string, for example when splitting on mixed separators or breaking a URL into parts.
Basic Example
/* Example of splitting strings into letters and numbers using a regular expression */
$string = 'Hello123';
$parts_string = preg_split('/(?<=\D)(?=\d)|(?<=\d)(?=\D)/', $string); // Splits letters and numbers
// The pattern matches the zero-width boundaries where a non-digit (\D) is followed
// by a digit (\d), or where a digit is followed by a non-digit.
print_r($parts_string);
/* Output:
Array (
[0] => Hello
[1] => 123
)
*/
echo 'Letter: ' . $parts_string[0] . PHP_EOL; // Output: 'Letter: Hello'
echo 'Number: ' . $parts_string[1] . PHP_EOL; // Output: 'Number: 123'
/* Simple example of extracting the protocol from a URL */
$url = "https://www.example.com/path/to/resource";
$parts_url = preg_split('#://#', $url);
print_r($parts_url);
/* Output:
Array (
[0] => https
[1] => www.example.com/path/to/resource
)
*/
echo 'Protocol type: ' . $parts_url[0]; // Output: 'Protocol type: https'
Double-check!
If the delimiter is a fixed string and you do not need the features of regular expressions, the explode() function is simpler and offers better performance. To cut a string into fixed-length chunks rather than at a delimiter, use the str_split() function.
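As a rough illustration of that advice, the sketch below handles a fixed delimiter with explode(), fixed-length chunks with str_split(), and reserves preg_split() for a delimiter that varies. The sample strings are made up for this example.

```php
<?php
// Fixed delimiter: explode() is enough.
$colors = explode(',', 'red,green,blue');
// $colors is ['red', 'green', 'blue']

// Fixed-length chunks: str_split() cuts every 4 characters.
$chunks = str_split('20240131', 4);
// $chunks is ['2024', '0131']

// Variable delimiter (commas, semicolons, or runs of whitespace): preg_split().
$tokens = preg_split('/[\s,;]+/', 'red, green;blue  yellow');
// $tokens is ['red', 'green', 'blue', 'yellow']

print_r($colors);
print_r($chunks);
print_r($tokens);
```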
Syntax
preg_split(
string $pattern,
string $subject,
int $limit = -1,
int $flags = 0
): array|false
/* preg_split(
The regular expression pattern,
The subject string to be split[,
The maximum number of substrings to be returned[,
The flags specifying additional settings]]
);
*/
Parameters
| Parameter | Description |
|---|---|
| $pattern | Required. The regular expression pattern used for splitting. |
| $subject | Required. The subject string to be split. |
| $limit | Optional. The maximum number of substrings to be returned; the rest of the string is placed in the last substring. The default value is -1. Both -1 and 0 mean no limit. |
| $flags | Optional. Constants that adjust how the result is built: PREG_SPLIT_NO_EMPTY, PREG_SPLIT_DELIM_CAPTURE, and PREG_SPLIT_OFFSET_CAPTURE. The default value is 0. Flags can be combined with the bitwise OR (\|) operator. |
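The $limit parameter is easiest to understand from a small example. The following sketch, using a made-up input string, shows that the unsplit remainder ends up in the last element, and that -1 is passed as a placeholder when you only want to set $flags.

```php
<?php
$str = 'one,two,three,four';

// Default ($limit = -1): split at every comma.
$all = preg_split('/,/', $str);
// $all is ['one', 'two', 'three', 'four']

// $limit = 2: at most two elements; the remainder stays in the last one.
$limited = preg_split('/,/', $str, 2);
// $limited is ['one', 'two,three,four']

// To set $flags without limiting the result, pass -1 for $limit.
$no_empty = preg_split('/,/', 'one,,two', -1, PREG_SPLIT_NO_EMPTY);
// $no_empty is ['one', 'two']

print_r($all);
print_r($limited);
print_r($no_empty);
```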
Return Values
Returns an array of strings split according to the specified regular expression pattern.
The function returns false on failure.
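Because the failure value is false rather than an empty array, it is worth checking with a strict comparison. The sketch below assumes PHP 7 or later (for the type declarations) and uses a hypothetical helper, split_or_fail(), which is not part of PHP; an invalid pattern is used to trigger the failure.

```php
<?php
// Hypothetical helper (not a PHP built-in): wraps preg_split() and
// turns the false return value into an exception.
function split_or_fail(string $pattern, string $subject): array
{
    // @ suppresses the warning PHP emits for an invalid pattern,
    // so the false return value can be handled explicitly.
    $parts = @preg_split($pattern, $subject);
    if ($parts === false) {
        throw new InvalidArgumentException("preg_split() failed for pattern: $pattern");
    }
    return $parts;
}

$ok = split_or_fail('/\s+/', 'a b  c');
// $ok is ['a', 'b', 'c']
print_r($ok);

$failed = false;
try {
    split_or_fail('/(unclosed/', 'a b c'); // invalid: missing closing parenthesis
} catch (InvalidArgumentException $e) {
    $failed = true;
    echo $e->getMessage(), PHP_EOL;
}
```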
Return Value Examples Based on the $flags Parameter
PREG_SPLIT_NO_EMPTY
Returns only non-empty pieces: any empty strings produced by the split are left out of the result array.
PREG_SPLIT_NO_EMPTY flag
$str = 'This,is,a,,string';
/* Splits the given string by a comma (,) and includes empty strings in the result. */
$parts = preg_split('/,/', $str);
print_r($parts);
/* Output:
Array (
[0] => This
[1] => is
[2] => a
[3] =>
[4] => string
)
*/
/* Splits the given string by a comma (,) and excludes empty strings from the result. */
$no_empty_parts = preg_split('/,/', $str, -1, PREG_SPLIT_NO_EMPTY);
print_r($no_empty_parts);
/* Output:
Array (
[0] => This
[1] => is
[2] => a
[3] => string
)
*/
PREG_SPLIT_DELIM_CAPTURE
When this flag is used, any part of the delimiter pattern enclosed in capturing parentheses () is also captured and included in the result, between the pieces it separated.
PREG_SPLIT_DELIM_CAPTURE flag
// Using the PREG_SPLIT_DELIM_CAPTURE flag
$str = 'This,is,a,,string';
/* Splits the given string by a comma (,) and includes the comma (,) in the result. */
$with_comma_parts = preg_split('/(,)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($with_comma_parts);
/* Output:
Array (
[0] => This
[1] => ,
[2] => is
[3] => ,
[4] => a
[5] => ,
[6] =>
[7] => ,
[8] => string
)
*/
PREG_SPLIT_OFFSET_CAPTURE
Includes the start position of each substring in the result array. When this flag is used, each element becomes a two-element array: the substring at index 0 and its offset (in bytes) within the subject string at index 1.
PREG_SPLIT_OFFSET_CAPTURE flag
// Using the PREG_SPLIT_OFFSET_CAPTURE flag
$str = 'This,is,a,,string';
/* Splits the given string by a comma (,) and
each element in the result array consists of a (string, start position) pair. */
$with_offset_parts = preg_split('/,/', $str, -1, PREG_SPLIT_OFFSET_CAPTURE);
print_r($with_offset_parts);
/* Output:
Array (
[0] => Array (
[0] => This
[1] => 0
)
[1] => Array (
[0] => is
[1] => 5
)
[2] => Array (
[0] => a
[1] => 8
)
[3] => Array (
[0] =>
[1] => 10
)
[4] => Array (
[0] => string
[1] => 11
)
)
*/
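The flags above can be combined with the bitwise OR (|) operator. As a small sketch, the following example tokenizes a made-up arithmetic expression: PREG_SPLIT_DELIM_CAPTURE keeps the operators, and PREG_SPLIT_NO_EMPTY drops any empty pieces.

```php
<?php
$expr = '12 + 7*3';

// The operator is wrapped in parentheses so it is captured;
// the surrounding optional whitespace is matched but not captured.
$tokens = preg_split(
    '#\s*([+\-*/])\s*#',
    $expr,
    -1,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
);
// $tokens is ['12', '+', '7', '*', '3']

print_r($tokens);
```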
Practical Examples
The following examples demonstrate how the preg_split() function behaves in various scenarios.
Parsing a CSV File
The preg_split() function can be used to extract data from a Comma-Separated Values (CSV) file. For instance, suppose we have the following CSV string:
$csv_data = "John,Doe,25\nJane,Smith,30\n";
You can split the data based on the comma (,) and the newline character (\n) as follows:
$lines = preg_split('/\n/', $csv_data, -1, PREG_SPLIT_NO_EMPTY);
$csv_array = [];
foreach ($lines as $line) {
    // No PREG_SPLIT_NO_EMPTY here: empty fields are kept so the columns stay aligned.
    $csv_array[] = preg_split('/,/', $line);
}
print_r($csv_array);
/* Output:
Array (
[0] => Array (
[0] => John
[1] => Doe
[2] => 25
)
[1] => Array (
[0] => Jane
[1] => Smith
[2] => 30
)
)
*/
Code Explanation
The foreach loop is a language construct for iterating over arrays and objects. Here it visits each line of the CSV data and splits that line into its fields.
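One detail to watch when splitting fields: PREG_SPLIT_NO_EMPTY discards empty fields, which shifts the remaining values into the wrong columns. The sketch below uses a made-up line with an empty middle field to show the difference. Neither variant handles quoted fields that contain commas; for real CSV files, PHP's dedicated CSV functions such as str_getcsv() are the safer choice.

```php
<?php
$line = 'John,,25'; // the middle field is empty

// Without the flag, the empty field is preserved.
$kept = preg_split('/,/', $line);
// $kept is ['John', '', '25']

// With the flag, the empty field disappears and '25' moves to index 1.
$dropped = preg_split('/,/', $line, -1, PREG_SPLIT_NO_EMPTY);
// $dropped is ['John', '25']

print_r($kept);
print_r($dropped);
```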
Parsing a URL
The preg_split() function is useful for extracting the protocol, host, and path from a URL.
$url = 'https://www.example.com/path/to/resource';
$url_parts = preg_split('#://|/#', $url, -1, PREG_SPLIT_NO_EMPTY);
print_r($url_parts);
/* Output:
Array (
[0] => https
[1] => www.example.com
[2] => path
[3] => to
[4] => resource
)
*/
As the code example above shows, the preg_split() function can break a URL into fine-grained pieces based on a specific pattern.
Note that PHP also provides a dedicated function for parsing a URL into its individual components: the parse_url() function. It returns the parts of the URL as an associative array, which is convenient when you want to obtain the protocol, host, path, and other elements independently.
Additional Explanation
The parse_url() function parses a given URL string and returns its components.
It identifies and returns URL components such as scheme, host, port, user, pass, path, query, and fragment.
The following code is a simple example that uses the parse_url() function to split a URL into its components.
$url = 'https://www.example.com/path/to/resource?query=string&foo=bar';
// Return the URL separated into its components
$url_components = parse_url($url);
// Output the result
print_r($url_components);
/* Output:
Array (
[scheme] => https
[host] => www.example.com
[path] => /path/to/resource
[query] => query=string&foo=bar
)
*/
The two URL parsing examples above show that the right method depends on the situation. To obtain the standard components of a URL, prefer the parse_url() function; consider the preg_split() function when you need to divide the string further along a custom pattern.
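The two functions also work together. In the sketch below, parse_url() isolates the path and the query string, and preg_split() then divides each into smaller pieces. The URL is the same sample used above; the variable names are chosen for this example.

```php
<?php
$url = 'https://www.example.com/path/to/resource?query=string&foo=bar';

// Let parse_url() find the components first.
$path  = parse_url($url, PHP_URL_PATH);  // '/path/to/resource'
$query = parse_url($url, PHP_URL_QUERY); // 'query=string&foo=bar'

// Then split each component on its own delimiter.
// is_string() guards against a missing component (null) or a malformed URL (false).
$segments = is_string($path)
    ? preg_split('#/+#', $path, -1, PREG_SPLIT_NO_EMPTY)
    : [];
$pairs = is_string($query)
    ? preg_split('/&/', $query, -1, PREG_SPLIT_NO_EMPTY)
    : [];

// $segments is ['path', 'to', 'resource']
// $pairs is ['query=string', 'foo=bar']
print_r($segments);
print_r($pairs);
```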
References
See also
- PHP preg_replace() Function – Replace Text Using Regular Expressions
- PHP preg_match() Function – Check if a String Matches a Regular Expression
- PHP preg_match_all() Function – Find All Matches of a Regular Expression
- PHP is_string() Function – Checking Whether a Value Is a String
- PHP str_split() Function – Splitting a String into an Array by a Specified Length