Concept of the substr() Function
The substr() function slices a string starting from a specified position (offset) and for a specified length, returning the extracted substring.
echo substr('Hello world!', 6, 5);
// Extracts 5 characters starting at position 6 (zero-based index)
// Output: world
Note:
When working with arrays, the array_slice() function extracts a portion of an array based on a specified range, returning a new array containing the selected elements.
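As a rough illustration of the array counterpart mentioned in the note, the sketch below slices a small list. The $fruits data is made up for this example and is not part of the original text.

```php
<?php
// Hypothetical sample data, used only for illustration.
$fruits = ['apple', 'banana', 'cherry', 'date', 'elderberry'];

// Take 2 elements starting at offset 1, analogous to substr($string, 1, 2).
$slice = array_slice($fruits, 1, 2);

print_r($slice);
// Prints an array holding 'banana' and 'cherry', re-indexed from 0
```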
Syntax
substr(string $string, int $start, ?int $length = null): string
Parameters
Parameter | Description |
---|---|
$string | Required. The original string from which the substring will be extracted. |
$start | Required. The starting position for extraction. The index starts at 0, meaning the first character of the string is at index 0. If a negative value is used, it counts backward from the end of the string. For example, -1 refers to the last character, and -2 refers to the second-to-last character. |
$length | Optional. The length of the substring to extract. If a negative value is used, that many characters are omitted from the end of the string. The default value is null, which means extracting all characters from the start position to the end of the string. |
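The parameter rules above can be tried out with a short sketch like the following. The variable names are chosen here for illustration only. The third call relies on the documented PHP behavior that a negative $length omits characters from the end of the string.

```php
<?php
$text = 'Hello world!';

$fromSix   = substr($text, 6);     // no $length: runs to the end of the string
$lastThree = substr($text, -3);    // negative $start: counts from the end
$trimmed   = substr($text, 0, -1); // negative $length: drops the final character

echo $fromSix, "\n";   // world!
echo $lastThree, "\n"; // ld!
echo $trimmed, "\n";   // Hello world
```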
Return Values
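The substr() function returns the extracted substring. If no characters can be extracted, for example because $start lies beyond the end of the string, it returns an empty string (false before PHP 8.0.0).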
Changelog
Version | Description |
---|---|
8.0.0 | The $length parameter can now explicitly accept a null value. When set to null, the function returns the substring from the start position to the end of the string, whereas previously it returned an empty string. |
8.0.0 | The function now returns an empty string in cases where it previously returned false. |
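A small sketch of the two 8.0.0 changes listed above. The results in the comments assume PHP 8.0 or later; on earlier versions the first call would yield an empty string and the second would yield false.

```php
<?php
// Explicit null $length: from $start to the end of the string (PHP 8.0+).
$tail = substr('Hello world!', 6, null);

// $start beyond the end of the string: empty string on PHP 8.0+,
// false on earlier versions.
$outOfRange = substr('abc', 5);

var_dump($tail);       // string(6) "world!" on PHP 8.0+
var_dump($outOfRange); // string(0) "" on PHP 8.0+
```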
Practical Examples
The following examples demonstrate how the substr() function behaves in various situations, including standard usage with ASCII strings and more complex cases involving multibyte characters.
Basic Usage
$originalString = 'Hello, world!';
$start = 7; // Start position for extraction
$length = 5; // Number of characters to extract
$extractedString = substr($originalString, $start, $length);
if ($extractedString !== false && $extractedString !== '') {
echo 'Extracted string: ' . $extractedString;
} else {
echo 'Failed to extract substring';
}
// Output: Extracted string: world
In this example, the substring 'world' is extracted starting from index 7 with a length of 5 from the original string 'Hello, world!', and then printed.
Using Negative Values for the $start Parameter: Counting from the End of the String
When a negative value is passed to the $start parameter, the substr() function begins counting from the end of the string instead of the beginning. For example, -1 refers to the last character, and -2 refers to the second-to-last character.
Here's an example of using a negative $start value:
$originalString = 'Hello, world!';
$startNegative = -6; // Start from the 6th character from the end
$length = 5; // Number of characters to extract
$extractedString = substr($originalString, $startNegative, $length);
echo 'Extracted string: ' . $extractedString;
// Output: Extracted string: world
In this example, the value -6 causes the function to start from the 6th character from the end of the string 'Hello, world!', resulting in the substring 'world'.
Below are additional examples demonstrating how substr() handles negative values:
// Example 1:
// Start at index 0 with a $length of -1.
// A negative $length omits that many characters from the end, so the final character is dropped.
echo substr('Hello world', 0, -1);
// Output: Hello worl
// Example 2:
// Start at -9 with a $length of -3.
// Extraction begins 9 characters from the end (index 2) and stops 3 characters before the end.
echo substr('Hello world', -9, -3);
// Output: llo wo
// Example 3:
// Start at index 0 with a $length of -4.
// The last 4 characters (index 7 onward) are omitted.
echo substr('Hello world', 0, -4);
// Output: Hello w
Negative values for the $start and $length parameters are easy to confuse: a negative $start sets where extraction begins, counted from the end of the string, while a negative $length sets how many trailing characters are omitted. Refer to the examples above carefully to avoid confusion.
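One way to reason about negative arguments is to translate them into absolute positions first. The helper below, resolveSubstrRange(), is a hypothetical function written for this illustration (it is not part of PHP). It is a sketch that assumes PHP 8 clamping behavior and returns the start index and the exclusive end index that substr() would use.

```php
<?php
/**
 * Hypothetical helper: resolves substr() arguments to absolute byte positions.
 * Returns [startIndex, endIndexExclusive].
 */
function resolveSubstrRange(string $string, int $start, ?int $length = null): array
{
    $size = strlen($string);

    // A negative $start counts back from the end; clamp the result into [0, $size].
    $begin = $start < 0 ? max(0, $size + $start) : min($start, $size);

    if ($length === null) {
        // No $length: run to the end of the string.
        $end = $size;
    } elseif ($length < 0) {
        // A negative $length omits that many bytes from the end.
        $end = max($begin, $size + $length);
    } else {
        // A positive $length cannot reach past the end of the string.
        $end = min($size, $begin + $length);
    }

    return [$begin, $end];
}

[$begin, $end] = resolveSubstrRange('Hello world', -9, -3);
echo "$begin..$end\n";                                   // 2..8
echo substr('Hello world', $begin, $end - $begin), "\n"; // llo wo
```

Writing the range out this way makes it easier to see that substr('Hello world', -9, -3) covers indices 2 through 7.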
Handling Multibyte Strings with substr() and mb_substr()
When working with strings containing multibyte characters (such as Japanese, Chinese, Korean, or emoji), using the substr() function can lead to unexpected results. This is because substr() operates on byte offsets rather than character counts.
Note:
In UTF-8 encoding, ASCII characters such as English letters and digits occupy 1 byte each, while all other characters occupy 2 to 4 bytes. Cutting a string at arbitrary byte positions may split a character in half, causing corrupted or garbled output.
For example, consider the Japanese string 'こんにちは世界' ("Hello, World" in Japanese):
echo substr('こんにちは世界', 0, 5);
// In UTF-8 each of these characters takes 3 bytes, so 5 bytes cover 'こ' plus part of 'ん'
// Output: こ followed by garbled bytes
The above code outputs broken characters because substr() cuts based on bytes, not characters.
To correctly handle multibyte strings, PHP provides the mb_substr() function, which works at the character level and respects multibyte encodings.
Using mb_substr(), the same operation works as expected:
echo mb_substr('こんにちは世界', 0, 5);
// Output: こんにちは
This correctly extracts the first five characters without breaking any multibyte characters.
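The difference between byte-based and character-based slicing can be checked with a sketch like the one below. It assumes the source file is saved as UTF-8 and that the mbstring extension is enabled. The encoding is passed explicitly so the result does not depend on configuration.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is enabled.
$text = 'こんにちは世界';

$byteLength = strlen($text);             // length in bytes
$charLength = mb_strlen($text, 'UTF-8'); // length in characters

$byteCut = substr($text, 0, 5);             // 5 bytes: 'こ' plus part of 'ん'
$charCut = mb_substr($text, 0, 5, 'UTF-8'); // 5 whole characters

echo $byteLength, "\n"; // 21
echo $charLength, "\n"; // 7

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false): cut mid-character
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)

echo $charCut, "\n"; // こんにちは
```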
Important Note
To avoid unexpected or corrupted output, always use mb_substr() instead of substr() when working with multibyte strings. This is especially important when your application supports multilingual content, user input in non-Latin scripts, or includes emoji.
Even if your current strings are mostly in English, using mb_substr() from the beginning ensures compatibility and correctness across diverse languages and character sets. (Note: mb_substr() is slightly slower than substr(), but the improved reliability outweighs the performance cost in most cases.)
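As one possible way to apply this advice, the sketch below wraps mb_substr() in a small truncation helper. The function truncateText() and its parameters are hypothetical names introduced for this example. It assumes UTF-8 input and an enabled mbstring extension.

```php
<?php
/**
 * Hypothetical helper: shortens text to at most $maxChars characters,
 * appending a suffix only when something was cut off.
 */
function truncateText(string $text, int $maxChars, string $suffix = '...'): string
{
    // Count characters, not bytes, so multibyte text is measured correctly.
    if (mb_strlen($text, 'UTF-8') <= $maxChars) {
        return $text;
    }

    return mb_substr($text, 0, $maxChars, 'UTF-8') . $suffix;
}

echo truncateText('Hello, world!', 5), "\n";  // Hello...
echo truncateText('こんにちは世界', 5), "\n"; // こんにちは...
echo truncateText('短い', 5), "\n";           // 短い
```

Because the helper counts and cuts by character, the same call behaves consistently for ASCII and multibyte input.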