Validating Strings as Hiragana Using Regex in PHP

This article explains how to use regular expressions in PHP to check if a string consists only of Hiragana characters.

Basic Pattern

If a string consists of one or more Hiragana characters, you can use the following regular expression pattern:

/^[ぁ-ん]+$/u

In this pattern, "^" represents the start of the string, "$" represents the end of the string, and "[ぁ-ん]" specifies the range of Hiragana characters from "ぁ" (U+3041) to "ん" (U+3093). The "+" quantifier means the character class must match one or more times. The final "u" modifier enables UTF-8 mode so that multibyte characters such as Japanese are treated as single characters.
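As a minimal sketch of how this pattern might be applied, the following wraps it in a small function. The name isBasicHiragana is hypothetical and used only for illustration; the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not part of the original article):
// returns true when $str consists of one or more characters in ぁ-ん.
function isBasicHiragana(string $str): bool
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error
    // (for example, when $str is not valid UTF-8).
    return preg_match('/^[ぁ-ん]+$/u', $str) === 1;
}

var_dump(isBasicHiragana('ひらがな'));    // bool(true)
var_dump(isBasicHiragana('カタカナ'));    // bool(false): Katakana
var_dump(isBasicHiragana('ひらがなabc')); // bool(false): contains Latin letters
var_dump(isBasicHiragana(''));            // bool(false): "+" requires at least one character
var_dump(isBasicHiragana("ひらがな\n"));  // bool(true): "$" also matches before a final newline
```

The last call shows a detail worth knowing for input validation: by default "$" also matches just before a trailing newline. If that is not wanted, adding the "D" modifier (/^[ぁ-ん]+$/uD) or using "\z" in place of "$" restricts the match to the very end of the string.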

Special Cases: Iteration Marks and Diacritics

If your string contains Hiragana iteration marks, as in "こゝろ" or "みすゞ", you can widen the range to include them:

/^[ぁ-ゞ]+$/u

However, note that this pattern may also match isolated diacritical marks like "゛" (voicing mark) or "゜" (semi-voicing mark), because they fall inside the same range.
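The difference between the two ranges can be checked directly. The sketch below uses Unicode escapes so that it does not depend on how the characters are typed; the $strict pattern is a hypothetical alternative, not taken from the article, that lists the iteration marks explicitly instead of using one wide range.

```php
<?php
$basic    = '/^[ぁ-ん]+$/u';      // U+3041 to U+3093
$extended = '/^[ぁ-ゞ]+$/u';      // U+3041 to U+309E
$strict   = '/^[ぁ-ゖゝゞ]+$/u';  // assumption: U+3041 to U+3096 plus the two iteration marks

$word = "こ\u{309D}ろ"; // "こゝろ", contains the iteration mark U+309D
$mark = "\u{309B}";     // the voicing mark on its own

var_dump(preg_match($basic, $word));    // int(0): the iteration mark is outside ぁ-ん
var_dump(preg_match($extended, $word)); // int(1)
var_dump(preg_match($extended, $mark)); // int(1): an isolated mark is accepted
var_dump(preg_match($strict, $word));   // int(1)
var_dump(preg_match($strict, $mark));   // int(0): the marks U+3099 to U+309C are excluded
```

Whether the stricter variant is appropriate depends on the input you expect; it also changes which small kana are accepted, so it is only one possible design.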

Unicode Property Escape

If your string contains long vowels or unique Hiragana characters such as "けき", you can use a Unicode property escape pattern:

/^[\p{Hiragana}]+$/u
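As an illustration of what the property escape adds over a fixed range, the sketch below tests "ゟ" (U+309F, HIRAGANA DIGRAPH YORI), which belongs to the Hiragana script but lies just past the end of the ぁ-ゞ range. The choice of test character is mine, not the article's.

```php
<?php
$range    = '/^[ぁ-ゞ]+$/u';
$property = '/^[\p{Hiragana}]+$/u';

$yori = "\u{309F}"; // "ゟ", one code point past "ゞ"

var_dump(preg_match($range, $yori));          // int(0): outside the explicit range
var_dump(preg_match($property, $yori));       // int(1): matched by the script property
var_dump(preg_match($property, 'ひらがな'));  // int(1)
var_dump(preg_match($property, 'カタカナ'));  // int(0): Katakana is a different script
```

How characters shared between scripts (for example the prolonged sound mark "ー") are treated may depend on the PCRE version bundled with your PHP build, so it is worth testing the specific characters you need to accept.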

Additional Notes

This article assumes UTF-8 encoding.

Note that the way modifiers are specified might differ depending on the regular expression library you are using. Refer to the library's documentation for specific details.

Below is a summary of commonly used regular expression patterns in PHP. Use these depending on your requirements:

#  Matching condition            Regular expression pattern
1  Contains only hiragana        ^[ぁ-ん]+$
                                 ^[\p{Hiragana}]+$
2  Fixed length of n hiragana    ^[ぁ-ん]{n}$
                                 ^[\p{Hiragana}]{n}$
3  At least n hiragana           ^[ぁ-ん]{n,}$
                                 ^[\p{Hiragana}]{n,}$
4  No more than m hiragana       ^[ぁ-ん]{1,m}$
                                 ^[\p{Hiragana}]{1,m}$
5  Between n and m hiragana      ^[ぁ-ん]{n,m}$
                                 ^[\p{Hiragana}]{n,m}$

Table: A list of common regular expression patterns for validating hiragana
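To make the quantifiers concrete, here is a small sketch that applies several of the length conditions listed above with n = 2 and m = 4. The helper matchesHiraganaLength is hypothetical; it concatenates the quantifier into the pattern for illustration only and should not be fed untrusted input.

```php
<?php
// Hypothetical helper: applies ^[ぁ-ん]<quantifier>$ to $str in UTF-8 mode.
function matchesHiraganaLength(string $str, string $quantifier): bool
{
    return preg_match('/^[ぁ-ん]' . $quantifier . '$/u', $str) === 1;
}

var_dump(matchesHiraganaLength('あい', '{2}'));         // bool(true):  exactly 2
var_dump(matchesHiraganaLength('あいう', '{2}'));       // bool(false): 3 characters
var_dump(matchesHiraganaLength('あいう', '{2,}'));      // bool(true):  at least 2
var_dump(matchesHiraganaLength('あ', '{2,}'));          // bool(false): only 1
var_dump(matchesHiraganaLength('あいうえ', '{1,4}'));   // bool(true):  no more than 4
var_dump(matchesHiraganaLength('あいうえお', '{1,4}')); // bool(false): 5 characters
var_dump(matchesHiraganaLength('あいう', '{2,4}'));     // bool(true):  between 2 and 4
```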

Source Code

Next, we introduce a PHP function to determine if an input string consists only of Hiragana characters. This function uses Unicode property escapes for flexibility and future-proofing. It can validate based on the following conditions:

  • When minimum length is omitted: Verifies the string consists only of Hiragana characters within the specified maximum length.
  • When maximum length is omitted: Ensures the string has only Hiragana characters and meets the minimum length requirement.
  • When both minimum and maximum lengths are omitted: Verifies that the entire string contains only Hiragana characters.
/**
 * Checks whether a given string consists only of Hiragana characters.
 *
 * @param string $str The input string.
 * @param ?int $minLength The minimum number of characters (treated as 1 if null).
 * @param ?int $maxLength The maximum number of characters (treated as the length of the string if null).
 * @return bool Returns true if the input string meets the conditions; otherwise, returns false.
 * @throws InvalidArgumentException Throws an exception if the minimum length is less than 1,
 * or if the maximum length is less than the minimum length.
 */
function isHiragana(string $str, ?int $minLength = null, ?int $maxLength = null): bool {
    // Set default values
    $min = $minLength ?? 1;
    $max = $maxLength ?? mb_strlen($str);

    // Validate arguments
    if ($min < 1) {
        // Throw an exception if the minimum length is less than 1
        throw new InvalidArgumentException('The minimum length must be an integer greater than or equal to 1.');
    }

    if (!is_null($maxLength) && $max < $min) {
        // Throw an exception if the maximum length is less than the minimum length
        throw new InvalidArgumentException('The minimum length must not exceed the maximum length.');
    }

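    // Construct the regular expression pattern for Hiragana characters.
    // The "D" modifier makes "$" match only at the very end of the string,
    // so input with a trailing newline (e.g. "あいう\n") is rejected.
    $pattern = is_null($maxLength)
        ? sprintf('/^[\p{Hiragana}]{%d,}$/uD', (int)$min) // No limit on the maximum number of characters
        : sprintf('/^[\p{Hiragana}]{%d,%d}$/uD', (int)$min, (int)$max);

    // Perform the validation using regular expressions
    // (preg_match returns false on error, which is also treated as "not valid")
    return (bool)preg_match($pattern, $str);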
}

Validation

Specify the range in terms of characters (not bytes).
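The distinction between characters and bytes can be seen in a short sketch. It assumes UTF-8 strings and the mbstring extension; the sample string is my own.

```php
<?php
$str = "\u{3042}\u{3044}\u{3046}"; // "あいう"

var_dump(strlen($str));             // int(9): bytes, since each of these characters takes 3 bytes in UTF-8
var_dump(mb_strlen($str, 'UTF-8')); // int(3): characters

// With the "u" modifier, quantifiers count characters, not bytes.
var_dump(preg_match('/^[\p{Hiragana}]{3}$/u', $str)); // int(1)
var_dump(preg_match('/^[\p{Hiragana}]{9}$/u', $str)); // int(0)
```

So a limit such as "at most 10" passed as a length argument refers to 10 Hiragana characters, which occupy 30 bytes in UTF-8.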

