Validating Strings as Half-width Katakana Using Regex in PHP
To check whether a string consists only of half-width Katakana characters using regular expressions, use the pattern ^[ｦ-ﾟ]+$. The character class spans U+FF66 (ｦ) through U+FF9F (ﾟ), so this regex means "a string composed of one or more half-width Katakana characters." It will match strings like "ﾊﾟﾀｰﾝ" (pattern) and "ｳﾞｨｰﾅｽ" (Venus), which include the chōonpu (long sound mark, ｰ), dakuten (voicing mark, ﾞ), and handakuten (semi-voicing mark, ﾟ).
For this pattern to work correctly in PHP, you need to add the "u" modifier (PCRE_UTF8) so that both the pattern and the subject string are treated as UTF-8, like this: /^[ｦ-ﾟ]+$/u.
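As a rough illustration of why the modifier matters, the sketch below runs the same character class with and without "u". The sample strings are illustrative choices, and the file is assumed to be saved as UTF-8.

```php
<?php
// Half-width Katakana range: U+FF66 (ｦ) through U+FF9F (ﾟ).
$pattern = '/^[ｦ-ﾟ]+$/u';

var_dump(preg_match($pattern, 'ﾊﾟﾀｰﾝ'));    // int(1): half-width, including ﾟ and ｰ
var_dump(preg_match($pattern, 'パターン')); // int(0): full-width Katakana
var_dump(preg_match($pattern, ''));         // int(0): "+" requires at least one character

// Without "u" the class is read as a set of single bytes, so an unrelated
// character whose UTF-8 bytes all fall inside that set is accepted by mistake.
$withoutU = preg_match('/^[ｦ-ﾟ]+$/', '水');
var_dump($withoutU);                        // int(1): the kanji slips through
```

The last call shows the failure mode: in byte mode the class no longer describes a range of characters, so results for multibyte input are unreliable.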
Below are common regular expression patterns for basic half-width Katakana validation:
# | Match Condition | Regular Expression Pattern |
---|---|---|
1 | All half-width Katakana | ^[ｦ-ﾟ]+$ |
2 | Exactly n half-width Katakana characters | ^[ｦ-ﾟ]{n}$ |
3 | At least n half-width Katakana characters | ^[ｦ-ﾟ]{n,}$ |
4 | At most m half-width Katakana characters | ^[ｦ-ﾟ]{1,m}$ |
5 | Between n and m half-width Katakana characters | ^[ｦ-ﾟ]{n,m}$ |
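To make the quantifiers concrete, here is a small sketch that fills in n = 2 and m = 4 (arbitrary example values) and tries each pattern against strings of one, two, four, and five characters. The variable names are hypothetical.

```php
<?php
// n = 2 and m = 4 are arbitrary example values.
$exactly2 = '/^[ｦ-ﾟ]{2}$/u';   // row 2: fixed length
$atLeast2 = '/^[ｦ-ﾟ]{2,}$/u';  // row 3: lower bound only
$atMost4  = '/^[ｦ-ﾟ]{1,4}$/u'; // row 4: upper bound only
$from2To4 = '/^[ｦ-ﾟ]{2,4}$/u'; // row 5: both bounds

$samples = ['ｱ', 'ｱｲ', 'ｱｲｳｴ', 'ｱｲｳｴｵ'];
foreach ($samples as $s) {
    printf(
        "%s: exactly2=%d atLeast2=%d atMost4=%d from2To4=%d\n",
        $s,
        preg_match($exactly2, $s),
        preg_match($atLeast2, $s),
        preg_match($atMost4, $s),
        preg_match($from2To4, $s)
    );
}
```

Only the two-character string satisfies all four patterns; the five-character string passes the "at least" check alone.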
Source Code
Next, we introduce a PHP function that determines if an input string consists solely of half-width Katakana characters. This function can validate strings based on the following conditions:
- If `minLength` is omitted: Checks if the string consists only of half-width Katakana characters and does not exceed the specified maximum length.
- If `maxLength` is omitted: Checks if the string consists only of half-width Katakana characters and meets or exceeds the specified minimum length.
- If both `minLength` and `maxLength` are omitted: Checks if the entire string consists solely of half-width Katakana characters.
/**
 * Checks if a string consists only of half-width Katakana characters.
 *
 * This function verifies that the string is composed entirely of half-width Katakana
 * characters and that its length, counted in characters, falls within the specified range.
 *
 * @param string $str       The input string.
 * @param ?int   $minLength Minimum length (defaults to 1 if null).
 * @param ?int   $maxLength Maximum length (no upper limit if null).
 * @return bool Returns true if the input string meets the conditions, false otherwise.
 * @throws InvalidArgumentException If $minLength is less than 1, or if $maxLength is less than $minLength.
 */
function isHalfWidthKatakana(string $str, ?int $minLength = null, ?int $maxLength = null): bool {
    // Default the minimum to 1; a null maximum means "no upper limit"
    $min = $minLength ?? 1;

    // Validate arguments
    if ($min < 1) {
        throw new InvalidArgumentException('Minimum length must be an integer of 1 or more.');
    }
    if ($maxLength !== null && $maxLength < $min) {
        throw new InvalidArgumentException('Minimum length must be less than or equal to maximum length.');
    }

    // Build the pattern over the half-width Katakana range U+FF66 (ｦ) to U+FF9F (ﾟ).
    // "u" treats pattern and subject as UTF-8; "D" keeps "$" from matching before a trailing newline.
    $pattern = $maxLength === null
        ? sprintf('/^[ｦ-ﾟ]{%d,}$/uD', $min)               // No upper limit
        : sprintf('/^[ｦ-ﾟ]{%d,%d}$/uD', $min, $maxLength);

    // preg_match returns 1 on a match, 0 on no match, and false on error (e.g. invalid UTF-8)
    return preg_match($pattern, $str) === 1;
}
Verification
Specify the length range as a number of characters, not bytes.
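The difference between the two counts can be checked with a short sketch like the one below. It assumes the mbstring extension is loaded and the file is saved as UTF-8; the sample word and variable names are illustrative.

```php
<?php
// Assumes the mbstring extension is available and this file is UTF-8.
$word = 'ﾊﾟﾀｰﾝ'; // ﾊ + ﾟ + ﾀ + ｰ + ﾝ

$chars = mb_strlen($word, 'UTF-8'); // 5 characters (code points)
$bytes = strlen($word);             // 15 bytes, 3 per character in UTF-8

// Under the "u" modifier the quantifier counts characters,
// so the limit that fits this word is 5, not 15.
$fitsFive = preg_match('/^[ｦ-ﾟ]{1,5}$/u', $word) === 1; // true
$fitsFour = preg_match('/^[ｦ-ﾟ]{1,4}$/u', $word) === 1; // false

printf("%d characters, %d bytes\n", $chars, $bytes);
var_dump($fitsFive, $fitsFour);
```

Note that in half-width Katakana the voicing marks ﾞ and ﾟ are separate characters, so a syllable such as ﾊﾟ uses two positions of the allowed length.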