ICU 54.1
icu::UnicodeString Class Reference

UnicodeString is a string class that stores Unicode characters directly and provides functionality similar to that of the Java String and StringBuffer classes.

#include <unistr.h>

Inheritance diagram for icu::UnicodeString:
icu::UnicodeString -> icu::Replaceable -> icu::UObject -> icu::UMemory

Data Structures

union  StackBufferOrFields

Public Types

enum  EInvariant { kInvariant }
 Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor which constructs a Unicode string from an invariant-character char * string.

Public Member Functions

UBool operator== (const UnicodeString &text) const
 Equality operator.
UBool operator!= (const UnicodeString &text) const
 Inequality operator.
UBool operator> (const UnicodeString &text) const
 Greater than operator.
UBool operator< (const UnicodeString &text) const
 Less than operator.
UBool operator>= (const UnicodeString &text) const
 Greater than or equal operator.
UBool operator<= (const UnicodeString &text) const
 Less than or equal operator.
int8_t compare (const UnicodeString &text) const
 Compare the characters bitwise in this UnicodeString to the characters in text.
int8_t compare (int32_t start, int32_t length, const UnicodeString &text) const
 Compare the characters bitwise in the range [start, start + length) with the characters in the entire string text.
int8_t compare (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
 Compare the characters bitwise in the range [start, start + length) with the characters in srcText in the range [srcStart, srcStart + srcLength).
int8_t compare (const UChar *srcChars, int32_t srcLength) const
 Compare the characters bitwise in this UnicodeString with the first srcLength characters in srcChars.
int8_t compare (int32_t start, int32_t length, const UChar *srcChars) const
 Compare the characters bitwise in the range [start, start + length) with the first length characters in srcChars
int8_t compare (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
 Compare the characters bitwise in the range [start, start + length) with the characters in srcChars in the range [srcStart, srcStart + srcLength).
int8_t compareBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit) const
 Compare the characters bitwise in the range [start, limit) with the characters in srcText in the range [srcStart, srcLimit).
int8_t compareCodePointOrder (const UnicodeString &text) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrder (int32_t start, int32_t length, const UnicodeString &srcText) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrder (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrder (const UChar *srcChars, int32_t srcLength) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrder (int32_t start, int32_t length, const UChar *srcChars) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrder (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrderBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit) const
 Compare two Unicode strings in code point order.
int8_t caseCompare (const UnicodeString &text, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompare (int32_t start, int32_t length, const UnicodeString &srcText, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompare (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompare (const UChar *srcChars, int32_t srcLength, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompare (int32_t start, int32_t length, const UChar *srcChars, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompare (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompareBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
UBool startsWith (const UnicodeString &text) const
 Determine if this starts with the characters in text
UBool startsWith (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
 Determine if this starts with the characters in srcText in the range [srcStart, srcStart + srcLength).
UBool startsWith (const UChar *srcChars, int32_t srcLength) const
 Determine if this starts with the characters in srcChars
UBool startsWith (const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
 Determine if this starts with the characters in srcChars in the range [srcStart, srcStart + srcLength).
UBool endsWith (const UnicodeString &text) const
 Determine if this ends with the characters in text
UBool endsWith (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
 Determine if this ends with the characters in srcText in the range [srcStart, srcStart + srcLength).
UBool endsWith (const UChar *srcChars, int32_t srcLength) const
 Determine if this ends with the characters in srcChars
UBool endsWith (const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
 Determine if this ends with the characters in srcChars in the range [srcStart, srcStart + srcLength).
int32_t indexOf (const UnicodeString &text) const
 Locate in this the first occurrence of the characters in text, using bitwise comparison.
int32_t indexOf (const UnicodeString &text, int32_t start) const
 Locate in this the first occurrence of the characters in text starting at offset start, using bitwise comparison.
int32_t indexOf (const UnicodeString &text, int32_t start, int32_t length) const
 Locate in this the first occurrence in the range [start, start + length) of the characters in text, using bitwise comparison.
int32_t indexOf (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the first occurrence in the range [start, start + length) of the characters in srcText in the range [srcStart, srcStart + srcLength), using bitwise comparison.
int32_t indexOf (const UChar *srcChars, int32_t srcLength, int32_t start) const
 Locate in this the first occurrence of the characters in srcChars starting at offset start, using bitwise comparison.
int32_t indexOf (const UChar *srcChars, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the first occurrence in the range [start, start + length) of the characters in srcChars, using bitwise comparison.
int32_t indexOf (const UChar *srcChars, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the first occurrence in the range [start, start + length) of the characters in srcChars in the range [srcStart, srcStart + srcLength), using bitwise comparison.
int32_t indexOf (UChar c) const
 Locate in this the first occurrence of the BMP code point c, using bitwise comparison.
int32_t indexOf (UChar32 c) const
 Locate in this the first occurrence of the code point c, using bitwise comparison.
int32_t indexOf (UChar c, int32_t start) const
 Locate in this the first occurrence of the BMP code point c, starting at offset start, using bitwise comparison.
int32_t indexOf (UChar32 c, int32_t start) const
 Locate in this the first occurrence of the code point c starting at offset start, using bitwise comparison.
int32_t indexOf (UChar c, int32_t start, int32_t length) const
 Locate in this the first occurrence of the BMP code point c in the range [start, start + length), using bitwise comparison.
int32_t indexOf (UChar32 c, int32_t start, int32_t length) const
 Locate in this the first occurrence of the code point c in the range [start, start + length), using bitwise comparison.
int32_t lastIndexOf (const UnicodeString &text) const
 Locate in this the last occurrence of the characters in text, using bitwise comparison.
int32_t lastIndexOf (const UnicodeString &text, int32_t start) const
 Locate in this the last occurrence of the characters in text starting at offset start, using bitwise comparison.
int32_t lastIndexOf (const UnicodeString &text, int32_t start, int32_t length) const
 Locate in this the last occurrence in the range [start, start + length) of the characters in text, using bitwise comparison.
int32_t lastIndexOf (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the last occurrence in the range [start, start + length) of the characters in srcText in the range [srcStart, srcStart + srcLength), using bitwise comparison.
int32_t lastIndexOf (const UChar *srcChars, int32_t srcLength, int32_t start) const
 Locate in this the last occurrence of the characters in srcChars starting at offset start, using bitwise comparison.
int32_t lastIndexOf (const UChar *srcChars, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the last occurrence in the range [start, start + length) of the characters in srcChars, using bitwise comparison.
int32_t lastIndexOf (const UChar *srcChars, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the last occurrence in the range [start, start + length) of the characters in srcChars in the range [srcStart, srcStart + srcLength), using bitwise comparison.
int32_t lastIndexOf (UChar c) const
 Locate in this the last occurrence of the BMP code point c, using bitwise comparison.
int32_t lastIndexOf (UChar32 c) const
 Locate in this the last occurrence of the code point c, using bitwise comparison.
int32_t lastIndexOf (UChar c, int32_t start) const
 Locate in this the last occurrence of the BMP code point c starting at offset start, using bitwise comparison.
int32_t lastIndexOf (UChar32 c, int32_t start) const
 Locate in this the last occurrence of the code point c starting at offset start, using bitwise comparison.
int32_t lastIndexOf (UChar c, int32_t start, int32_t length) const
 Locate in this the last occurrence of the BMP code point c in the range [start, start + length), using bitwise comparison.
int32_t lastIndexOf (UChar32 c, int32_t start, int32_t length) const
 Locate in this the last occurrence of the code point c in the range [start, start + length), using bitwise comparison.
UChar charAt (int32_t offset) const
 Return the code unit at offset offset.
UChar operator[] (int32_t offset) const
 Return the code unit at offset offset.
UChar32 char32At (int32_t offset) const
 Return the code point that contains the code unit at offset offset.
int32_t getChar32Start (int32_t offset) const
 Adjust a random-access offset so that it points to the beginning of a Unicode character.
int32_t getChar32Limit (int32_t offset) const
 Adjust a random-access offset so that it points behind a Unicode character.
int32_t moveIndex32 (int32_t index, int32_t delta) const
 Move the code unit index along the string by delta code points.
void extract (int32_t start, int32_t length, UChar *dst, int32_t dstStart=0) const
 Copy the characters in the range [start, start + length) into the array dst, beginning at dstStart.
int32_t extract (UChar *dest, int32_t destCapacity, UErrorCode &errorCode) const
 Copy the contents of the string into dest.
void extract (int32_t start, int32_t length, UnicodeString &target) const
 Copy the characters in the range [start, start + length) into the UnicodeString target.
void extractBetween (int32_t start, int32_t limit, UChar *dst, int32_t dstStart=0) const
 Copy the characters in the range [start, limit) into the array dst, beginning at dstStart.
virtual void extractBetween (int32_t start, int32_t limit, UnicodeString &target) const
 Copy the characters in the range [start, limit) into the UnicodeString target.
int32_t extract (int32_t start, int32_t startLength, char *target, int32_t targetCapacity, enum EInvariant inv) const
 Copy the characters in the range [start, start + startLength) into an array of characters.
int32_t extract (int32_t start, int32_t startLength, char *target, uint32_t targetLength) const
 Copy the characters in the range [start, start + startLength) into an array of characters in the platform's default codepage.
int32_t extract (int32_t start, int32_t startLength, char *target, const char *codepage=0) const
 Copy the characters in the range [start, start + startLength) into an array of characters in a specified codepage.
int32_t extract (int32_t start, int32_t startLength, char *target, uint32_t targetLength, const char *codepage) const
 Copy the characters in the range [start, start + startLength) into an array of characters in a specified codepage.
int32_t extract (char *dest, int32_t destCapacity, UConverter *cnv, UErrorCode &errorCode) const
 Convert the UnicodeString into a codepage string using an existing UConverter.
UnicodeString tempSubString (int32_t start=0, int32_t length=INT32_MAX) const
 Create a temporary substring for the specified range.
UnicodeString tempSubStringBetween (int32_t start, int32_t limit=INT32_MAX) const
 Create a temporary substring for the specified range.
void toUTF8 (ByteSink &sink) const
 Convert the UnicodeString to UTF-8 and write the result to a ByteSink.
template<typename StringClass >
StringClass & toUTF8String (StringClass &result) const
 Convert the UnicodeString to UTF-8 and append the result to a standard string.
int32_t toUTF32 (UChar32 *utf32, int32_t capacity, UErrorCode &errorCode) const
 Convert the UnicodeString to UTF-32.
int32_t length (void) const
 Return the length of the UnicodeString object.
int32_t countChar32 (int32_t start=0, int32_t length=INT32_MAX) const
 Count Unicode code points in the length UChar code units of the string.
UBool hasMoreChar32Than (int32_t start, int32_t length, int32_t number) const
 Check if the length UChar code units of the string contain more Unicode code points than a certain number.
UBool isEmpty (void) const
 Determine if this string is empty.
int32_t getCapacity (void) const
 Return the capacity of the internal buffer of the UnicodeString object.
int32_t hashCode (void) const
 Generate a hash code for this object.
UBool isBogus (void) const
 Determine if this object contains a valid string.
UnicodeString & operator= (const UnicodeString &srcText)
 Assignment operator.
UnicodeString & fastCopyFrom (const UnicodeString &src)
 Almost the same as the assignment operator.
UnicodeString & operator= (UChar ch)
 Assignment operator.
UnicodeString & operator= (UChar32 ch)
 Assignment operator.
UnicodeString & setTo (const UnicodeString &srcText, int32_t srcStart)
 Set the text in the UnicodeString object to the characters in srcText in the range [srcStart, srcText.length()).
UnicodeString & setTo (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
 Set the text in the UnicodeString object to the characters in srcText in the range [srcStart, srcStart + srcLength).
UnicodeString & setTo (const UnicodeString &srcText)
 Set the text in the UnicodeString object to the characters in srcText.
UnicodeString & setTo (const UChar *srcChars, int32_t srcLength)
 Set the characters in the UnicodeString object to the characters in srcChars.
UnicodeString & setTo (UChar srcChar)
 Set the characters in the UnicodeString object to the code unit srcChar.
UnicodeString & setTo (UChar32 srcChar)
 Set the characters in the UnicodeString object to the code point srcChar.
UnicodeString & setTo (UBool isTerminated, const UChar *text, int32_t textLength)
 Aliasing setTo() function, analogous to the readonly-aliasing UChar* constructor.
UnicodeString & setTo (UChar *buffer, int32_t buffLength, int32_t buffCapacity)
 Aliasing setTo() function, analogous to the writable-aliasing UChar* constructor.
void setToBogus ()
 Make this UnicodeString object invalid.
UnicodeString & setCharAt (int32_t offset, UChar ch)
 Set the character at the specified offset to the specified character.
UnicodeString & operator+= (UChar ch)
 Append operator.
UnicodeString & operator+= (UChar32 ch)
 Append operator.
UnicodeString & operator+= (const UnicodeString &srcText)
 Append operator.
UnicodeString & append (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
 Append the characters in srcText in the range [srcStart, srcStart + srcLength) to the UnicodeString object.
UnicodeString & append (const UnicodeString &srcText)
 Append the characters in srcText to the UnicodeString object.
UnicodeString & append (const UChar *srcChars, int32_t srcStart, int32_t srcLength)
 Append the characters in srcChars in the range [srcStart, srcStart + srcLength) to the UnicodeString object.
UnicodeString & append (const UChar *srcChars, int32_t srcLength)
 Append the characters in srcChars to the UnicodeString object.
UnicodeString & append (UChar srcChar)
 Append the code unit srcChar to the UnicodeString object.
UnicodeString & append (UChar32 srcChar)
 Append the code point srcChar to the UnicodeString object.
UnicodeString & insert (int32_t start, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
 Insert the characters in srcText in the range [srcStart, srcStart + srcLength) into the UnicodeString object at offset start.
UnicodeString & insert (int32_t start, const UnicodeString &srcText)
 Insert the characters in srcText into the UnicodeString object at offset start.
UnicodeString & insert (int32_t start, const UChar *srcChars, int32_t srcStart, int32_t srcLength)
 Insert the characters in srcChars in the range [srcStart, srcStart + srcLength) into the UnicodeString object at offset start.
UnicodeString & insert (int32_t start, const UChar *srcChars, int32_t srcLength)
 Insert the characters in srcChars into the UnicodeString object at offset start.
UnicodeString & insert (int32_t start, UChar srcChar)
 Insert the code unit srcChar into the UnicodeString object at offset start.
UnicodeString & insert (int32_t start, UChar32 srcChar)
 Insert the code point srcChar into the UnicodeString object at offset start.
UnicodeString & replace (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
 Replace the characters in the range [start, start + length) with the characters in srcText in the range [srcStart, srcStart + srcLength).
UnicodeString & replace (int32_t start, int32_t length, const UnicodeString &srcText)
 Replace the characters in the range [start, start + length) with the characters in srcText.
UnicodeString & replace (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength)
 Replace the characters in the range [start, start + length) with the characters in srcChars in the range [srcStart, srcStart + srcLength).
UnicodeString & replace (int32_t start, int32_t length, const UChar *srcChars, int32_t srcLength)
 Replace the characters in the range [start, start + length) with the characters in srcChars.
UnicodeString & replace (int32_t start, int32_t length, UChar srcChar)
 Replace the characters in the range [start, start + length) with the code unit srcChar.
UnicodeString & replace (int32_t start, int32_t length, UChar32 srcChar)
 Replace the characters in the range [start, start + length) with the code point srcChar.
UnicodeString & replaceBetween (int32_t start, int32_t limit, const UnicodeString &srcText)
 Replace the characters in the range [start, limit) with the characters in srcText.
UnicodeString & replaceBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit)
 Replace the characters in the range [start, limit) with the characters in srcText in the range [srcStart, srcLimit).
virtual void handleReplaceBetween (int32_t start, int32_t limit, const UnicodeString &text)
 Replace a substring of this object with the given text.
virtual UBool hasMetaData () const
 Replaceable API.
virtual void copy (int32_t start, int32_t limit, int32_t dest)
 Copy a substring of this object, retaining attribute (out-of-band) information.
UnicodeString & findAndReplace (const UnicodeString &oldText, const UnicodeString &newText)
 Replace all occurrences of characters in oldText with the characters in newText.
UnicodeString & findAndReplace (int32_t start, int32_t length, const UnicodeString &oldText, const UnicodeString &newText)
 Replace all occurrences of characters in oldText with characters in newText in the range [start, start + length).
UnicodeString & findAndReplace (int32_t start, int32_t length, const UnicodeString &oldText, int32_t oldStart, int32_t oldLength, const UnicodeString &newText, int32_t newStart, int32_t newLength)
 Replace all occurrences of characters in oldText in the range [oldStart, oldStart + oldLength) with the characters in newText in the range [newStart, newStart + newLength) in the range [start, start + length).
UnicodeString & remove (void)
 Remove all characters from the UnicodeString object.
UnicodeString & remove (int32_t start, int32_t length=(int32_t) INT32_MAX)
 Remove the characters in the range [start, start + length) from the UnicodeString object.
UnicodeString & removeBetween (int32_t start, int32_t limit=(int32_t) INT32_MAX)
 Remove the characters in the range [start, limit) from the UnicodeString object.
UnicodeString & retainBetween (int32_t start, int32_t limit=INT32_MAX)
 Retain only the characters in the range [start, limit) from the UnicodeString object.
UBool padLeading (int32_t targetLength, UChar padChar=0x0020)
 Pad the start of this UnicodeString with the character padChar.
UBool padTrailing (int32_t targetLength, UChar padChar=0x0020)
 Pad the end of this UnicodeString with the character padChar.
UBool truncate (int32_t targetLength)
 Truncate this UnicodeString to the targetLength.
UnicodeString & trim (void)
 Trims leading and trailing whitespace from this UnicodeString.
UnicodeString & reverse (void)
 Reverse this UnicodeString in place.
UnicodeString & reverse (int32_t start, int32_t length)
 Reverse the range [start, start + length) in this UnicodeString.
UnicodeString & toUpper (void)
 Convert the characters in this to UPPER CASE following the conventions of the default locale.
UnicodeString & toUpper (const Locale &locale)
 Convert the characters in this to UPPER CASE following the conventions of a specific locale.
UnicodeString & toLower (void)
 Convert the characters in this to lower case following the conventions of the default locale.
UnicodeString & toLower (const Locale &locale)
 Convert the characters in this to lower case following the conventions of a specific locale.
UnicodeString & toTitle (BreakIterator *titleIter)
 Titlecase this string, convenience function using the default locale.
UnicodeString & toTitle (BreakIterator *titleIter, const Locale &locale)
 Titlecase this string.
UnicodeString & toTitle (BreakIterator *titleIter, const Locale &locale, uint32_t options)
 Titlecase this string, with options.
UnicodeString & foldCase (uint32_t options=0)
 Case-folds the characters in this string.
UChar * getBuffer (int32_t minCapacity)
 Get a read/write pointer to the internal buffer.
void releaseBuffer (int32_t newLength=-1)
 Release a read/write buffer on a UnicodeString object with an "open" getBuffer(minCapacity).
const UChar * getBuffer () const
 Get a read-only pointer to the internal buffer.
const UChar * getTerminatedBuffer ()
 Get a read-only pointer to the internal buffer, making sure that it is NUL-terminated.
 UnicodeString ()
 Construct an empty UnicodeString.
 UnicodeString (int32_t capacity, UChar32 c, int32_t count)
 Construct a UnicodeString with capacity to hold capacity UChars.
UNISTR_FROM_CHAR_EXPLICIT UnicodeString (UChar ch)
 Single UChar (code unit) constructor.
UNISTR_FROM_CHAR_EXPLICIT UnicodeString (UChar32 ch)
 Single UChar32 (code point) constructor.
UNISTR_FROM_STRING_EXPLICIT UnicodeString (const UChar *text)
 UChar* constructor.
 UnicodeString (const UChar *text, int32_t textLength)
 UChar* constructor.
 UnicodeString (UBool isTerminated, const UChar *text, int32_t textLength)
 Readonly-aliasing UChar* constructor.
 UnicodeString (UChar *buffer, int32_t buffLength, int32_t buffCapacity)
 Writable-aliasing UChar* constructor.
UNISTR_FROM_STRING_EXPLICIT UnicodeString (const char *codepageData)
 char* constructor.
 UnicodeString (const char *codepageData, int32_t dataLength)
 char* constructor.
 UnicodeString (const char *codepageData, const char *codepage)
 char* constructor.
 UnicodeString (const char *codepageData, int32_t dataLength, const char *codepage)
 char* constructor.
 UnicodeString (const char *src, int32_t srcLength, UConverter *cnv, UErrorCode &errorCode)
 char * / UConverter constructor.
 UnicodeString (const char *src, int32_t length, enum EInvariant inv)
 Constructs a Unicode string from an invariant-character char * string.
 UnicodeString (const UnicodeString &that)
 Copy constructor.
 UnicodeString (const UnicodeString &src, int32_t srcStart)
 'Substring' constructor from tail of source string.
 UnicodeString (const UnicodeString &src, int32_t srcStart, int32_t srcLength)
 'Substring' constructor from subrange of source string.
virtual Replaceable * clone () const
 Clone this object, an instance of a subclass of Replaceable.
virtual ~UnicodeString ()
 Destructor.
UnicodeString unescape () const
 Unescape a string of characters and return a string containing the result.
UChar32 unescapeAt (int32_t &offset) const
 Unescape a single escape sequence and return the represented character.
virtual UClassID getDynamicClassID () const
 ICU "poor man's RTTI", returns a UClassID for the actual class.
- Public Member Functions inherited from icu::Replaceable
virtual ~Replaceable ()
 Destructor.
- Public Member Functions inherited from icu::UObject
virtual ~UObject ()
 Destructor.

Static Public Member Functions

static UnicodeString fromUTF8 (const StringPiece &utf8)
 Create a UnicodeString from a UTF-8 string.
static UnicodeString fromUTF32 (const UChar32 *utf32, int32_t length)
 Create a UnicodeString from a UTF-32 string.
static UClassID getStaticClassID ()
 ICU "poor man's RTTI", returns a UClassID for this class.

Protected Member Functions

virtual int32_t getLength () const
 Implement Replaceable::getLength() (see jitterbug 1027).
virtual UChar getCharAt (int32_t offset) const
 The change in Replaceable to use virtual getCharAt() allows UnicodeString::charAt() to be inline again (see jitterbug 709).
virtual UChar32 getChar32At (int32_t offset) const
 The change in Replaceable to use virtual getChar32At() allows UnicodeString::char32At() to be inline again (see jitterbug 709).
- Protected Member Functions inherited from icu::Replaceable
 Replaceable ()
 Default constructor.

Friends

class StringThreadTest
class UnicodeStringAppendable
union StackBufferOrFields

Detailed Description

UnicodeString is a string class that stores Unicode characters directly and provides functionality similar to that of the Java String and StringBuffer classes.

It is a concrete implementation of the abstract class Replaceable (for transliteration).

The UnicodeString class is not suitable for subclassing.

For an overview of Unicode strings in C and C++ see the User Guide Strings chapter.

In ICU, a Unicode string consists of 16-bit Unicode code units. A Unicode character may be stored with either one code unit (the most common case) or with a matched pair of special code units ("surrogates"). The data type for code units is UChar. For single-character handling, a Unicode character code point is a value in the range 0..0x10ffff. ICU uses the UChar32 type for code points.
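
The relationship between code units and code points can be illustrated with a minimal standalone sketch. It uses std::u16string from the C++ standard library rather than ICU itself, and the helper names (isLead, isTrail, countCodePoints, combine) are assumptions made for this illustration, not ICU functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if `u` is a lead (high) surrogate, the first half of a pair.
inline bool isLead(char16_t u) { return (u & 0xFC00) == 0xD800; }
// True if `u` is a trail (low) surrogate, the second half of a pair.
inline bool isTrail(char16_t u) { return (u & 0xFC00) == 0xDC00; }

// Count code points in a UTF-16 string. A matched lead+trail pair counts
// once; every other code unit (including an unpaired surrogate) counts once.
std::size_t countCodePoints(const std::u16string &s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i, ++n) {
        if (isLead(s[i]) && i + 1 < s.size() && isTrail(s[i + 1])) {
            ++i;  // skip the trail unit of the pair
        }
    }
    return n;
}

// Combine a matched surrogate pair into the code point it encodes
// (a value in the range 0x10000..0x10ffff).
char32_t combine(char16_t lead, char16_t trail) {
    assert(isLead(lead) && isTrail(trail));
    return 0x10000 + ((char32_t(lead) - 0xD800) << 10) + (char32_t(trail) - 0xDC00);
}
```

For a string holding "a", U+1F600, and "b", this sketch sees four code units but three code points. In ICU the corresponding queries are length() and countChar32().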

Indexes and offsets into and lengths of strings always count code units, not code points. This is the same as with multi-byte char* strings in traditional string handling. Operations on partial strings typically do not test for code point boundaries. If necessary, the user needs to take care of such boundaries by testing for the code unit values or by using functions like UnicodeString::getChar32Start() and UnicodeString::getChar32Limit() (or, in C, the equivalent macros U16_SET_CP_START() and U16_SET_CP_LIMIT(), see utf.h).
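
The boundary adjustment mentioned above can be sketched as follows. This is a standalone illustration on std::u16string, assuming the usual UTF-16 rule that an offset is "inside" a character only when it sits between a lead surrogate and its trail surrogate. The function names codePointStart and codePointLimit are hypothetical; they only approximate what getChar32Start() and getChar32Limit() are documented to do.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline bool isLead(char16_t u)  { return (u & 0xFC00) == 0xD800; }
inline bool isTrail(char16_t u) { return (u & 0xFC00) == 0xDC00; }

// True if offset `i` falls between the two halves of a matched surrogate pair.
bool splitsPair(const std::u16string &s, std::size_t i) {
    assert(i <= s.size());
    return i > 0 && i < s.size() && isLead(s[i - 1]) && isTrail(s[i]);
}

// Move `i` back to the first code unit of the code point it points into.
std::size_t codePointStart(const std::u16string &s, std::size_t i) {
    return splitsPair(s, i) ? i - 1 : i;
}

// Move `i` forward so that it points just behind the code point it points into.
std::size_t codePointLimit(const std::u16string &s, std::size_t i) {
    return splitsPair(s, i) ? i + 1 : i;
}
```

Offsets that already lie on a code point boundary are returned unchanged, so it is safe to apply either function before slicing a string at an arbitrary offset.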

UnicodeString methods are more lenient with regard to input parameter values than other ICU APIs. In particular:

In string comparisons, two UnicodeString objects that are both "bogus" compare equal (to be transitive and prevent endless loops in sorting), and a "bogus" string compares less than any non-"bogus" one.
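
The ordering rule for "bogus" strings can be sketched in isolation. The struct MaybeBogus and the function compareWithBogus below are hypothetical stand-ins written for this illustration, using only the C++ standard library; they are not how ICU implements the rule.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a string that may be in an invalid ("bogus") state.
struct MaybeBogus {
    bool bogus;
    std::u16string text;  // ignored when bogus is true
};

// Returns -1, 0, or +1.
// Two bogus values compare equal; a bogus value sorts before any valid one,
// including the empty string. Valid strings compare by code unit value.
int compareWithBogus(const MaybeBogus &a, const MaybeBogus &b) {
    if (a.bogus || b.bogus) {
        if (a.bogus && b.bogus) return 0;
        return a.bogus ? -1 : 1;
    }
    int c = a.text.compare(b.text);
    return (c > 0) - (c < 0);
}
```

Because bogus values form a single equivalence class below all valid strings, the comparison stays transitive, which is what a sort routine needs in order to terminate.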

Const UnicodeString methods are thread-safe. Multiple threads can use const methods on the same UnicodeString object simultaneously, but non-const methods must not be called concurrently (in multiple threads) with any other (const or non-const) methods.

Similarly, const UnicodeString & parameters are thread-safe. One object may be passed in as such a parameter concurrently in multiple threads. This includes the const UnicodeString & parameters for copy construction, assignment, and cloning.

UnicodeString uses several storage methods. String contents can be stored inside the UnicodeString object itself, in an allocated and shared buffer, or in an outside buffer that is "aliased". Most of this is done transparently, but careful aliasing in particular provides significant performance improvements. Also, the internal buffer is accessible via special functions. For details see the User Guide Strings chapter.

See Also
utf.h
CharacterIterator
Stable:
ICU 2.0

Definition at line 245 of file unistr.h.

Member Enumeration Documentation

enum icu::UnicodeString::EInvariant

Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor which constructs a Unicode string from an invariant-character char * string.

Use the macro US_INV instead of the full qualification for this value.

See Also
US_INV
Stable:
ICU 3.2
Enumerator:
kInvariant 
See Also
EInvariant
Stable:
ICU 3.2

Definition at line 257 of file unistr.h.

Constructor & Destructor Documentation

icu::UnicodeString::UnicodeString ( )
inline

Construct an empty UnicodeString.

Stable:
ICU 2.0

Definition at line 3611 of file unistr.h.

icu::UnicodeString::UnicodeString ( int32_t  capacity,
UChar32  c,
int32_t  count 
)

Construct a UnicodeString with capacity to hold capacity UChars.

Parameters
capacity  the number of UChars this UnicodeString should hold before a resize is necessary; if count is greater than 0 and count code points c take up more space than capacity, then capacity is adjusted accordingly.
c  is used to initially fill the string
count  specifies how many code points c are to be written in the string
Stable:
ICU 2.0

UNISTR_FROM_CHAR_EXPLICIT icu::UnicodeString::UnicodeString ( UChar  ch)

Single UChar (code unit) constructor.

It is recommended to mark this constructor "explicit" by -DUNISTR_FROM_CHAR_EXPLICIT=explicit on the compiler command line or similar.

Parameters
ch  the character to place in the UnicodeString
Stable:
ICU 2.0

UNISTR_FROM_CHAR_EXPLICIT icu::UnicodeString::UnicodeString ( UChar32  ch)

Single UChar32 (code point) constructor.

It is recommended to mark this constructor "explicit" by -DUNISTR_FROM_CHAR_EXPLICIT=explicit on the compiler command line or similar.

Parameters
ch  the character to place in the UnicodeString
Stable:
ICU 2.0

UNISTR_FROM_STRING_EXPLICIT icu::UnicodeString::UnicodeString ( const UChar *  text)

UChar* constructor.

It is recommended to mark this constructor "explicit" by -DUNISTR_FROM_STRING_EXPLICIT=explicit on the compiler command line or similar.

Parameters
text  The characters to place in the UnicodeString. text must be NUL (U+0000) terminated.
Stable:
ICU 2.0
icu::UnicodeString::UnicodeString ( const UChar *  text,
int32_t  textLength 
)

UChar* constructor.

Parameters
text  The characters to place in the UnicodeString.
textLength  The number of Unicode characters in text to copy.
Stable:
ICU 2.0
icu::UnicodeString::UnicodeString ( UBool  isTerminated,
const UChar *  text,
int32_t  textLength 
)

Readonly-aliasing UChar* constructor.

The text will be used for the UnicodeString object, but it will not be released when the UnicodeString is destroyed. This has copy-on-write semantics: When the string is modified, then the buffer is first copied into newly allocated memory. The aliased buffer is never modified.

In an assignment to another UnicodeString, when using the copy constructor or the assignment operator, the text will be copied. When using fastCopyFrom(), the text will be aliased again, so that both strings then alias the same readonly-text.

Parameters
isTerminated  specifies if text is NUL-terminated. This must be true if textLength==-1.
text  The characters to alias for the UnicodeString.
textLength  The number of Unicode characters in text to alias. If -1, then this constructor will determine the length by calling u_strlen().
Stable:
ICU 2.0
icu::UnicodeString::UnicodeString ( UChar *  buffer,
int32_t  buffLength,
int32_t  buffCapacity 
)

Writable-aliasing UChar* constructor.

The text will be used for the UnicodeString object, but it will not be released when the UnicodeString is destroyed. This has write-through semantics: For as long as the capacity of the buffer is sufficient, write operations will directly affect the buffer. When more capacity is necessary, then a new buffer will be allocated and the contents copied as with regularly constructed strings. In an assignment to another UnicodeString, the buffer will be copied. The extract(UChar *dst) function detects whether the dst pointer is the same as the string buffer itself and will in this case not copy the contents.

Parameters
buffer  The characters to alias for the UnicodeString.
buffLength  The number of Unicode characters in buffer to alias.
buffCapacity  The size of buffer in UChars.
Stable:
ICU 2.0
UNISTR_FROM_STRING_EXPLICIT icu::UnicodeString::UnicodeString ( const char *  codepageData)

char* constructor.

Uses the default converter (and thus depends on the ICU conversion code) unless U_CHARSET_IS_UTF8 is set to 1.

For ASCII (really "invariant character") strings it is more efficient to use the constructor that takes a US_INV (for its enum EInvariant). For ASCII (invariant-character) string literals, see UNICODE_STRING and UNICODE_STRING_SIMPLE.

It is recommended to mark this constructor "explicit" by -DUNISTR_FROM_STRING_EXPLICIT=explicit on the compiler command line or similar.

Parameters
codepageData  an array of bytes, NUL-terminated, in the platform's default codepage.
Stable:
ICU 2.0
See Also
UNICODE_STRING
UNICODE_STRING_SIMPLE
icu::UnicodeString::UnicodeString ( const char *  codepageData,
int32_t  dataLength 
)

char* constructor.

Uses the default converter (and thus depends on the ICU conversion code) unless U_CHARSET_IS_UTF8 is set to 1.

Parameters
codepageData  an array of bytes in the platform's default codepage.
dataLength  The number of bytes in codepageData.
Stable:
ICU 2.0
icu::UnicodeString::UnicodeString ( const char *  codepageData,
const char *  codepage 
)

char* constructor.

Parameters
codepageData  an array of bytes, NUL-terminated
codepage  the encoding of codepageData. The special value 0 for codepage indicates that the text is in the platform's default codepage.

If codepage is an empty string (""), then a simple conversion is performed on the codepage-invariant subset ("invariant characters") of the platform encoding. See utypes.h. Recommendation: For invariant-character strings use the constructor UnicodeString(const char *src, int32_t length, enum EInvariant inv) because it avoids object code dependencies of UnicodeString on the conversion code.

Stable:
ICU 2.0
icu::UnicodeString::UnicodeString ( const char *  codepageData,
int32_t  dataLength,
const char *  codepage 
)

char* constructor.

Parameters
codepageData  an array of bytes.
dataLength  The number of bytes in codepageData.
codepage  the encoding of codepageData. The special value 0 for codepage indicates that the text is in the platform's default codepage. If codepage is an empty string (""), then a simple conversion is performed on the codepage-invariant subset ("invariant characters") of the platform encoding. See utypes.h. Recommendation: For invariant-character strings use the constructor UnicodeString(const char *src, int32_t length, enum EInvariant inv) because it avoids object code dependencies of UnicodeString on the conversion code.
Stable:
ICU 2.0
icu::UnicodeString::UnicodeString ( const char *  src,
int32_t  srcLength,
UConverter *  cnv,
UErrorCode &  errorCode 
)

char * / UConverter constructor.

This constructor uses an existing UConverter object to convert the codepage string to Unicode and construct a UnicodeString from that.

The converter is reset at first. If the error code indicates a failure before this constructor is called, or if an error occurs during conversion or construction, then the string will be bogus.

This function avoids the overhead of opening and closing a converter if multiple strings are constructed.

Parameters
src  input codepage string
srcLength  length of the input string, can be -1 for NUL-terminated strings
cnv  converter object (ucnv_resetToUnicode() will be called), can be NULL for the default converter
errorCode  normal ICU error code
Stable:
ICU 2.0
icu::UnicodeString::UnicodeString ( const char *  src,
int32_t  length,
enum EInvariant  inv 
)

Constructs a Unicode string from an invariant-character char * string.

About invariant characters see utypes.h. This constructor has no runtime dependency on conversion code and is therefore recommended over ones taking a charset name string (where the empty string "" indicates invariant-character conversion).

Use the macro US_INV as the third, signature-distinguishing parameter.

For example:

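    #include "unicode/unistr.h"

    void fn(const char *s) {
        // US_INV selects the invariant-character constructor; -1 means s is NUL-terminated.
        icu::UnicodeString ustr(s, -1, US_INV);
        // use ustr ...
    }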
Parameters
src  String using only invariant characters.
length  Length of src, or -1 if NUL-terminated.
inv  Signature-distinguishing parameter, use US_INV.
See Also
US_INV
Stable:
ICU 3.2
icu::UnicodeString::UnicodeString ( const UnicodeString &  that)

Copy constructor.

Parameters
that  The UnicodeString object to copy.
Stable:
ICU 2.0
icu::UnicodeString::UnicodeString ( const UnicodeString &  src,
int32_t  srcStart 
)

'Substring' constructor from tail of source string.

Parameters
src  The UnicodeString object to copy.
srcStart  The offset into src at which to start copying.
Stable:
ICU 2.2
icu::UnicodeString::UnicodeString ( const UnicodeString &  src,
int32_t  srcStart,
int32_t  srcLength 
)

'Substring' constructor from subrange of source string.

Parameters
src  The UnicodeString object to copy.
srcStart  The offset into src at which to start copying.
srcLength  The number of characters from src to copy.
Stable:
ICU 2.2
virtual icu::UnicodeString::~UnicodeString ( )
virtual

Destructor.

Stable:
ICU 2.0

Member Function Documentation

UnicodeString & icu::UnicodeString::append ( const UnicodeString &  srcText,
int32_t  srcStart,
int32_t  srcLength 
)
inline

Append the characters in srcText in the range [srcStart, srcStart + srcLength) to the end of this UnicodeString object.

srcText is not modified.

Parameters
srcText  the source for the new characters
srcStart  the offset into srcText where new characters will be obtained
srcLength  the number of characters from srcText to append
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4337 of file unistr.h.

References length().

Referenced by operator+=(), and icu::Transliterator::setID().

UnicodeString & icu::UnicodeString::append ( const UnicodeString &  srcText)
inline

Append the characters in srcText to the UnicodeString object.

srcText is not modified.

Parameters
srcText  the source for the new characters
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4343 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::append ( const UChar *  srcChars,
int32_t  srcStart,
int32_t  srcLength 
)
inline

Append the characters in srcChars in the range [srcStart, srcStart + srcLength) to the end of this UnicodeString object.

srcChars is not modified.

Parameters
srcChars  the source for the new characters
srcStart  the offset into srcChars where new characters will be obtained
srcLength  the number of characters from srcChars to append; can be -1 if srcChars is NUL-terminated
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4347 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::append ( const UChar *  srcChars,
int32_t  srcLength 
)
inline

Append the characters in srcChars to the end of this UnicodeString object.

srcChars is not modified.

Parameters
srcChars  the source for the new characters
srcLength  the number of Unicode characters in srcChars; can be -1 if srcChars is NUL-terminated
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4353 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::append ( UChar  srcChar)
inline

Append the code unit srcChar to the UnicodeString object.

Parameters
srcChar  the code unit to append
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4358 of file unistr.h.

References length().

UnicodeString& icu::UnicodeString::append ( UChar32  srcChar)

Append the code point srcChar to the UnicodeString object.

Parameters
srcChar  the code point to append
Returns
a reference to this
Stable:
ICU 2.0
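
Appending a code point can add either one or two code units, because supplementary code points (U+10000 and above) are stored as surrogate pairs in UTF-16. The standalone sketch below illustrates that encoding step. It uses std::u16string rather than UnicodeString, and appendCodePoint is a hypothetical helper, not an ICU function.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of ICU): append one code point to a UTF-16
// string, using one code unit for code points below U+10000 and a surrogate
// pair for supplementary code points (U+10000..U+10FFFF).
inline void appendCodePoint(std::u16string &s, char32_t c) {
    assert(c <= 0x10FFFF);  // values beyond the Unicode range are a caller error in this sketch
    if (c < 0x10000) {
        s.push_back(static_cast<char16_t>(c));
    } else {
        c -= 0x10000;
        s.push_back(static_cast<char16_t>(0xD800 + (c >> 10)));    // lead surrogate
        s.push_back(static_cast<char16_t>(0xDC00 + (c & 0x3FF)));  // trail surrogate
    }
}
```

With this helper, appending U+0041 grows the string by one code unit, while appending U+1F600 grows it by two (0xD83D, 0xDE00).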
int8_t icu::UnicodeString::caseCompare ( const UnicodeString &  text,
uint32_t  options 
) const
inline

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(text.foldCase(options)).

Parameters
text  Another string to compare this one to.
options  A bit set of options:
  • U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
  • U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
  • U_FOLD_CASE_EXCLUDE_SPECIAL_I
Returns
A negative, zero, or positive integer indicating the comparison result.
Stable:
ICU 2.0

Definition at line 3831 of file unistr.h.

References length().
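
The "fold both operands, then compare" equivalence stated above can be sketched in standalone C++. The folding table here is a toy assumption (ASCII letters plus U+00DF), chosen only to show that full case folding may change the string length; toyFoldCase and toyCaseCompare are hypothetical names, and real case folding relies on the Unicode case folding data.

```cpp
#include <cassert>
#include <string>

// Toy case folding for illustration only: maps ASCII A-Z to a-z and
// U+00DF (sharp s) to "ss". Real full case folding covers far more characters.
inline std::u16string toyFoldCase(const std::u16string &s) {
    std::u16string out;
    for (char16_t c : s) {
        if (c >= u'A' && c <= u'Z') {
            out.push_back(static_cast<char16_t>(c - u'A' + u'a'));
        } else if (c == 0x00DF) {
            out += u"ss";  // one code unit folds to two
        } else {
            out.push_back(c);
        }
    }
    return out;
}

// Fold both operands, then compare in code unit order.
// Returns a negative, zero, or positive value.
inline int toyCaseCompare(const std::u16string &a, const std::u16string &b) {
    return toyFoldCase(a).compare(toyFoldCase(b));
}
```

Under this toy table, "Straße" and "STRASSE" compare as equal even though their lengths differ before folding.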

int8_t icu::UnicodeString::caseCompare ( int32_t  start,
int32_t  length,
const UnicodeString &  srcText,
uint32_t  options 
) const
inline

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(srcText.foldCase(options)).

Parameters
start  The start offset in this string at which the compare operation begins.
length  The number of code units from this string to compare.
srcText  Another string to compare this one to.
options  A bit set of options:
  • U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
  • U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
  • U_FOLD_CASE_EXCLUDE_SPECIAL_I
Returns
A negative, zero, or positive integer indicating the comparison result.
Stable:
ICU 2.0

Definition at line 3836 of file unistr.h.

References length().

int8_t icu::UnicodeString::caseCompare ( int32_t  start,
int32_t  length,
const UnicodeString &  srcText,
int32_t  srcStart,
int32_t  srcLength,
uint32_t  options 
) const
inline

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(srcText.foldCase(options)).

Parameters
start  The start offset in this string at which the compare operation begins.
length  The number of code units from this string to compare.
srcText  Another string to compare this one to.
srcStart  The start offset in that string at which the compare operation begins.
srcLength  The number of code units from that string to compare.
options  A bit set of options:
  • U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
  • U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
  • U_FOLD_CASE_EXCLUDE_SPECIAL_I
Returns
A negative, zero, or positive integer indicating the comparison result.
Stable:
ICU 2.0

Definition at line 3851 of file unistr.h.

int8_t icu::UnicodeString::caseCompare ( const UChar *  srcChars,
int32_t  srcLength,
uint32_t  options 
) const
inline

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(srcChars.foldCase(options)).

Parameters
srcChars  A pointer to another string to compare this one to.
srcLength  The number of code units from that string to compare.
options  A bit set of options:
  • U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
  • U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
  • U_FOLD_CASE_EXCLUDE_SPECIAL_I
Returns
A negative, zero, or positive integer indicating the comparison result.
Stable:
ICU 2.0

Definition at line 3844 of file unistr.h.

References length().

int8_t icu::UnicodeString::caseCompare ( int32_t  start,
int32_t  length,
const UChar *  srcChars,
uint32_t  options 
) const
inline

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(srcChars.foldCase(options)).

Parameters
start  The start offset in this string at which the compare operation begins.
length  The number of code units from this string to compare.
srcChars  A pointer to another string to compare this one to.
options  A bit set of options:
  • U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
  • U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
  • U_FOLD_CASE_EXCLUDE_SPECIAL_I
Returns
A negative, zero, or positive integer indicating the comparison result.
Stable:
ICU 2.0

Definition at line 3861 of file unistr.h.

int8_t icu::UnicodeString::caseCompare ( int32_t  start,
int32_t  length,
const UChar *  srcChars,
int32_t  srcStart,
int32_t  srcLength,
uint32_t  options 
) const
inline

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(srcChars.foldCase(options)).

Parameters
start  The start offset in this string at which the compare operation begins.
length  The number of code units from this string to compare.
srcChars  A pointer to another string to compare this one to.
srcStart  The start offset in that string at which the compare operation begins.
srcLength  The number of code units from that string to compare.
options  A bit set of options:
  • U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
  • U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
  • U_FOLD_CASE_EXCLUDE_SPECIAL_I
Returns
A negative, zero, or positive integer indicating the comparison result.
Stable:
ICU 2.0

Definition at line 3869 of file unistr.h.

int8_t icu::UnicodeString::caseCompareBetween ( int32_t  start,
int32_t  limit,
const UnicodeString &  srcText,
int32_t  srcStart,
int32_t  srcLimit,
uint32_t  options 
) const
inline

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compareBetween(srcText.foldCase(options)).

Parameters
start  The start offset in this string at which the compare operation begins.
limit  The offset after the last code unit from this string to compare.
srcText  Another string to compare this one to.
srcStart  The start offset in that string at which the compare operation begins.
srcLimit  The offset after the last code unit from that string to compare.
options  A bit set of options:
  • U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
  • U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
  • U_FOLD_CASE_EXCLUDE_SPECIAL_I
Returns
A negative, zero, or positive integer indicating the comparison result.
Stable:
ICU 2.0

Definition at line 3879 of file unistr.h.

UChar32 icu::UnicodeString::char32At ( int32_t  offset) const

Return the code point that contains the code unit at offset offset.

If the offset is not valid (0..length()-1) then U+ffff is returned.

Parameters
offset  a valid offset into the text that indicates the text offset of any of the code units that will be assembled into a code point (21-bit value) and returned
Returns
the code point of text at offset or 0xffff if the offset is not valid for this string
Stable:
ICU 2.0

Reimplemented from icu::Replaceable.

Referenced by icu::DecimalFormatSymbols::setSymbol().
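
The rule that offset may point at any code unit of the code point can be illustrated with a standalone sketch over std::u16string. codePointAt is a hypothetical stand-in for char32At(), not ICU code, and returning an unpaired surrogate unchanged is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline bool isLead(char16_t u)  { return (u & 0xFC00) == 0xD800; }
inline bool isTrail(char16_t u) { return (u & 0xFC00) == 0xDC00; }

// Combine a lead and a trail surrogate into a supplementary code point.
inline char32_t combine(char16_t lead, char16_t trail) {
    return 0x10000 + ((char32_t(lead) - 0xD800) << 10) + (char32_t(trail) - 0xDC00);
}

// Hypothetical stand-in for char32At(): returns the code point that contains
// the code unit at `offset`, or 0xFFFF if `offset` is out of range.
inline char32_t codePointAt(const std::u16string &s, std::ptrdiff_t offset) {
    if (offset < 0 || static_cast<std::size_t>(offset) >= s.size()) {
        return 0xFFFF;
    }
    const std::size_t i = static_cast<std::size_t>(offset);
    const char16_t u = s[i];
    if (isLead(u) && i + 1 < s.size() && isTrail(s[i + 1])) {
        return combine(u, s[i + 1]);  // offset is on the lead surrogate
    }
    if (isTrail(u) && i > 0 && isLead(s[i - 1])) {
        return combine(s[i - 1], u);  // offset is on the trail surrogate
    }
    return u;  // single code unit; an unpaired surrogate is returned as-is (assumption)
}
```

For a string holding 'a', U+1F600, 'b', offsets 1 and 2 both yield U+1F600 in this sketch.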

UChar icu::UnicodeString::charAt ( int32_t  offset) const
inline

Return the code unit at offset offset.

If the offset is not valid (0..length()-1) then U+ffff is returned.

Parameters
offset  a valid offset into the text
Returns
the code unit at offset offset or 0xffff if the offset is not valid for this string
Stable:
ICU 2.0

Reimplemented from icu::Replaceable.

Definition at line 4244 of file unistr.h.

virtual Replaceable* icu::UnicodeString::clone ( ) const
virtual

Clone this object, an instance of a subclass of Replaceable.

Clones can be used concurrently in multiple threads. If a subclass does not implement clone(), or if an error occurs, then NULL is returned. The clone functions in all subclasses return a pointer to a Replaceable because some compilers do not support covariant (same-as-this) return types; cast to the appropriate subclass if necessary. The caller must delete the clone.

Returns
a clone of this object
See Also
Replaceable::clone
getDynamicClassID
Stable:
ICU 2.6

Reimplemented from icu::Replaceable.

int8_t icu::UnicodeString::compare ( const UnicodeString &  text) const
inline

Compare the characters bitwise in this UnicodeString to the characters in text.

Parameters
text  The UnicodeString to compare to this one.
Returns
The result of bitwise character comparison: 0 if this contains the same characters as text, -1 if the characters in this are bitwise less than the characters in text, +1 if the characters in this are bitwise greater than the characters in text.
Stable:
ICU 2.0

Definition at line 3708 of file unistr.h.

References length().

Referenced by startsWith().

int8_t icu::UnicodeString::compare ( int32_t  start,
int32_t  length,
const UnicodeString &  text 
) const
inline

Compare the characters bitwise in the range [start, start + length) with the characters in the entire string text.

(The parameters "start" and "length" are not applied to the other text "text".)

Parameters
start  the offset in this string at which the compare operation begins
length  the number of characters in this string to compare.
text  the other text to be compared against this string.
Returns
The result of bitwise character comparison: 0 if this contains the same characters as text, -1 if the characters in this are bitwise less than the characters in text, +1 if the characters in this are bitwise greater than the characters in text.
Stable:
ICU 2.0

Definition at line 3712 of file unistr.h.

References length().
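
A standalone sketch of the range semantics: start and length select a sub-range of this string only, and that sub-range is compared with the whole of text. compareRange is a hypothetical model built on std::u16string, and clamping out-of-range values to the string is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of compare(start, length, text): the range
// [start, start + length) of `self` is compared with ALL of `text`.
// Out-of-range values are clamped to the string (assumption).
inline int compareRange(const std::u16string &self, std::size_t start,
                        std::size_t length, const std::u16string &text) {
    start = std::min(start, self.size());
    length = std::min(length, self.size() - start);
    const int r = self.compare(start, length, text);
    return (r > 0) - (r < 0);  // normalize to -1, 0, or +1
}
```

For example, comparing the range [6, 11) of "hello world" with "world" gives 0, while the range [6, 9) ("wor") is a proper prefix of "world" and gives -1.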

int8_t icu::UnicodeString::compare ( int32_t  start,
int32_t  length,
const UnicodeString &  srcText,
int32_t  srcStart,
int32_t  srcLength 
) const
inline

Compare the characters bitwise in the range [start, start + length) with the characters in srcText in the range [srcStart, srcStart + srcLength).

Parameters
start  the offset at which the compare operation begins
length  the number of characters in this to compare.
srcText  the text to be compared
srcStart  the offset into srcText to start comparison
srcLength  the number of characters in srcText to compare
Returns
The result of bitwise character comparison: 0 if this contains the same characters as srcText, -1 if the characters in this are bitwise less than the characters in srcText, +1 if the characters in this are bitwise greater than the characters in srcText.
Stable:
ICU 2.0

Definition at line 3723 of file unistr.h.

int8_t icu::UnicodeString::compare ( const UChar *  srcChars,
int32_t  srcLength 
) const
inline

Compare the characters bitwise in this UnicodeString with the first srcLength characters in srcChars.

Parameters
srcChars  The characters to compare to this UnicodeString.
srcLength  the number of characters in srcChars to compare
Returns
The result of bitwise character comparison: 0 if this contains the same characters as srcChars, -1 if the characters in this are bitwise less than the characters in srcChars, +1 if the characters in this are bitwise greater than the characters in srcChars.
Stable:
ICU 2.0

Definition at line 3718 of file unistr.h.

References length().

int8_t icu::UnicodeString::compare ( int32_t  start,
int32_t  length,
const UChar *  srcChars 
) const
inline

Compare the characters bitwise in the range [start, start + length) with the first length characters in srcChars.

Parameters
start  the offset at which the compare operation begins
length  the number of characters to compare.
srcChars  the characters to be compared
Returns
The result of bitwise character comparison: 0 if this contains the same characters as srcChars, -1 if the characters in this are bitwise less than the characters in srcChars, +1 if the characters in this are bitwise greater than the characters in srcChars.
Stable:
ICU 2.0

Definition at line 3731 of file unistr.h.

int8_t icu::UnicodeString::compare ( int32_t  start,
int32_t  length,
const UChar *  srcChars,
int32_t  srcStart,
int32_t  srcLength 
) const
inline

Compare the characters bitwise in the range [start, start + length) with the characters in srcChars in the range [srcStart, srcStart + srcLength).

Parameters
start  the offset at which the compare operation begins
length  the number of characters in this to compare
srcChars  the characters to be compared
srcStart  the offset into srcChars to start comparison
srcLength  the number of characters in srcChars to compare
Returns
The result of bitwise character comparison: 0 if this contains the same characters as srcChars, -1 if the characters in this are bitwise less than the characters in srcChars, +1 if the characters in this are bitwise greater than the characters in srcChars.
Stable:
ICU 2.0

Definition at line 3737 of file unistr.h.

int8_t icu::UnicodeString::compareBetween ( int32_t  start,
int32_t  limit,
const UnicodeString &  srcText,
int32_t  srcStart,
int32_t  srcLimit 
) const
inline

Compare the characters bitwise in the range [start, limit) with the characters in srcText in the range [srcStart, srcLimit).

Parameters
start  the offset at which the compare operation begins
limit  the offset immediately following the compare operation
srcText  the text to be compared
srcStart  the offset into srcText to start comparison
srcLimit  the offset into srcText to limit comparison
Returns
The result of bitwise character comparison: 0 if this contains the same characters as srcText, -1 if the characters in this are bitwise less than the characters in srcText, +1 if the characters in this are bitwise greater than the characters in srcText.
Stable:
ICU 2.0

Definition at line 3745 of file unistr.h.

int8_t icu::UnicodeString::compareCodePointOrder ( const UnicodeString &  text) const
inline

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff, which means that they compare as less than some other BMP characters like U+feff. This function compares Unicode strings in code point order. If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters
text  Another string to compare this one to.
Returns
a negative/zero/positive integer corresponding to whether this string is less than/equal to/greater than the second one in code point order
Stable:
ICU 2.0

Definition at line 3769 of file unistr.h.

References length().
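
The difference between code unit order and code point order can be reproduced with a standalone sketch over std::u16string. The function names are hypothetical, and decoding both strings to code points first is a simple illustrative approach, not necessarily how ICU implements the comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode UTF-16 into code points. Unpaired surrogates are passed through
// unchanged; the documentation above leaves that case undefined.
inline std::u32string decodeUtf16(const std::u16string &s) {
    std::u32string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        char32_t c = s[i];
        if ((c & 0xFC00) == 0xD800 && i + 1 < s.size() &&
            (s[i + 1] & 0xFC00) == 0xDC00) {
            c = 0x10000 + ((c - 0xD800) << 10) + (char32_t(s[i + 1]) - 0xDC00);
            ++i;  // consumed the trail surrogate
        }
        out.push_back(c);
    }
    return out;
}

inline int signOf(int r) { return (r > 0) - (r < 0); }

// Code unit order: compares the raw 16-bit units.
inline int compareCodeUnitOrder(const std::u16string &a, const std::u16string &b) {
    return signOf(a.compare(b));
}

// Code point order: compares the decoded code points.
inline int compareCodePointOrderSketch(const std::u16string &a, const std::u16string &b) {
    return signOf(decodeUtf16(a).compare(decodeUtf16(b)));
}
```

With U+FEFF on one side and U+10000 (stored as 0xD800 0xDC00) on the other, the code unit comparison ranks U+FEFF higher because 0xFEFF > 0xD800, whereas the code point comparison ranks it lower.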

int8_t icu::UnicodeString::compareCodePointOrder ( int32_t  start,
int32_t  length,
const UnicodeString &  srcText 
) const
inline

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff, which means that they compare as less than some other BMP characters like U+feff. This function compares Unicode strings in code point order. If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters
start  The start offset in this string at which the compare operation begins.
length  The number of code units from this string to compare.
srcText  Another string to compare this one to.
Returns
a negative/zero/positive integer corresponding to whether this string is less than/equal to/greater than the second one in code point order
Stable:
ICU 2.0

Definition at line 3773 of file unistr.h.

References length().

int8_t icu::UnicodeString::compareCodePointOrder ( int32_t  start,
int32_t  length,
const UnicodeString &  srcText,
int32_t  srcStart,
int32_t  srcLength 
) const
inline

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff, which means that they compare as less than some other BMP characters like U+feff. This function compares Unicode strings in code point order. If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters
start  The start offset in this string at which the compare operation begins.
length  The number of code units from this string to compare.
srcText  Another string to compare this one to.
srcStart  The start offset in that string at which the compare operation begins.
srcLength  The number of code units from that string to compare.
Returns
a negative/zero/positive integer corresponding to whether this string is less than/equal to/greater than the second one in code point order
Stable:
ICU 2.0

Definition at line 3784 of file unistr.h.

int8_t icu::UnicodeString::compareCodePointOrder ( const UChar *  srcChars,
int32_t  srcLength 
) const
inline

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff, which means that they compare as less than some other BMP characters like U+feff. This function compares Unicode strings in code point order. If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters
srcChars  A pointer to another string to compare this one to.
srcLength  The number of code units from that string to compare.
Returns
a negative/zero/positive integer corresponding to whether this string is less than/equal to/greater than the second one in code point order
Stable:
ICU 2.0

Definition at line 3779 of file unistr.h.

References length().

int8_t icu::UnicodeString::compareCodePointOrder ( int32_t  start,
int32_t  length,
const UChar *  srcChars 
) const
inline

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff, which means that they compare as less than some other BMP characters like U+feff. This function compares Unicode strings in code point order. If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters
start  The start offset in this string at which the compare operation begins.
length  The number of code units from this string to compare.
srcChars  A pointer to another string to compare this one to.
Returns
a negative/zero/positive integer corresponding to whether this string is less than/equal to/greater than the second one in code point order
Stable:
ICU 2.0

Definition at line 3792 of file unistr.h.

int8_t icu::UnicodeString::compareCodePointOrder ( int32_t  start,
int32_t  length,
const UChar *  srcChars,
int32_t  srcStart,
int32_t  srcLength 
) const
inline

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff, which means that they compare as less than some other BMP characters like U+feff. This function compares Unicode strings in code point order. If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters
start  The start offset in this string at which the compare operation begins.
length  The number of code units from this string to compare.
srcChars  A pointer to another string to compare this one to.
srcStart  The start offset in that string at which the compare operation begins.
srcLength  The number of code units from that string to compare.
Returns
a negative/zero/positive integer corresponding to whether this string is less than/equal to/greater than the second one in code point order
Stable:
ICU 2.0

Definition at line 3798 of file unistr.h.

int8_t icu::UnicodeString::compareCodePointOrderBetween ( int32_t  start,
int32_t  limit,
const UnicodeString &  srcText,
int32_t  srcStart,
int32_t  srcLimit 
) const
inline

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff, which means that they compare as less than some other BMP characters like U+feff. This function compares Unicode strings in code point order. If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters
start  The start offset in this string at which the compare operation begins.
limit  The offset after the last code unit from this string to compare.
srcText  Another string to compare this one to.
srcStart  The start offset in that string at which the compare operation begins.
srcLimit  The offset after the last code unit from that string to compare.
Returns
a negative/zero/positive integer corresponding to whether this string is less than/equal to/greater than the second one in code point order
Stable:
ICU 2.0

Definition at line 3806 of file unistr.h.

virtual void icu::UnicodeString::copy ( int32_t  start,
int32_t  limit,
int32_t  dest 
)
virtual

Copy a substring of this object, retaining attribute (out-of-band) information.

This method is used to duplicate or reorder substrings. The destination index must not overlap the source range.

Parameters
startthe beginning index, inclusive; 0 <= start <= limit.
limitthe ending index, exclusive; start <= limit <= length().
destthe destination index. The characters from start..limit-1 will be copied to dest. Implementations of this method may assume that dest <= start || dest >= limit.
Stable:
ICU 2.0

Implements icu::Replaceable.
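
As an illustration of the copy semantics, here is a small standalone model on std::u16string. It assumes the copied characters are inserted at dest (lengthening the string) rather than overwriting existing text; that reading, and the name copyWithin, are assumptions of this sketch, and attribute (out-of-band) information is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Duplicate s[start, limit) and insert the duplicate at index dest.
// Like the description above, the caller is expected to ensure
// dest <= start || dest >= limit; this sketch does not verify that.
inline void copyWithin(std::u16string &s, std::size_t start,
                       std::size_t limit, std::size_t dest) {
    if (limit <= start || limit > s.size() || dest > s.size()) return;
    const std::u16string piece = s.substr(start, limit - start);
    s.insert(dest, piece);
}

inline void exampleCopyWithin() {
    std::u16string s = u"abcdef";
    copyWithin(s, 1, 3, 5);       // duplicate "bc" in front of 'f'
    assert(s == u"abcdebcf");
}
```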

int32_t icu::UnicodeString::countChar32 ( int32_t  start = 0,
int32_t  length = INT32_MAX 
) const

Count the Unicode code points in the length UChar code units of the string that begin at start.

A code point may occupy either one or two UChar code units. Counting code points involves reading all code units.

This function is basically the inverse of moveIndex32().

Parameters
startthe index of the first code unit to check
lengththe number of UChar code units to check
Returns
the number of code points in the specified code units
See Also
length
Stable:
ICU 2.0

Referenced by icu::DecimalFormatSymbols::setSymbol().
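
A sketch of the counting rule, written against std::u16string rather than ICU: each surrogate pair counts as one code point and every other code unit (including an unpaired surrogate) counts as one. The name countCodePoints is hypothetical, and clamping the range to the string is an assumption of this model.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline bool isLeadSurrogate(char16_t u)  { return (u & 0xFC00) == 0xD800; }
inline bool isTrailSurrogate(char16_t u) { return (u & 0xFC00) == 0xDC00; }

// Count code points in s[start, start + length), clamping the range to s.
inline std::size_t countCodePoints(const std::u16string &s,
                                   std::size_t start = 0,
                                   std::size_t length = std::u16string::npos) {
    if (start > s.size()) start = s.size();
    const std::size_t limit =
        (length > s.size() - start) ? s.size() : start + length;
    std::size_t count = 0;
    for (std::size_t i = start; i < limit; ++i) {
        ++count;
        // Skip the trail unit when a complete pair lies inside the range.
        if (isLeadSurrogate(s[i]) && i + 1 < limit && isTrailSurrogate(s[i + 1])) ++i;
    }
    return count;
}

inline void exampleCountCodePoints() {
    const std::u16string s = u"a\U0001F600b";  // 4 code units
    assert(s.size() == 4);
    assert(countCodePoints(s) == 3);           // 3 code points
}
```

Every code unit in the range is read once, which is why counting is linear in length.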

UBool icu::UnicodeString::endsWith ( const UnicodeString & text) const
inline

Determine if this ends with the characters in text.

Parameters
textThe text to match.
Returns
TRUE if this ends with the characters in text, FALSE otherwise
Stable:
ICU 2.0

Definition at line 4081 of file unistr.h.

References length().

UBool icu::UnicodeString::endsWith ( const UnicodeString & srcText,
int32_t  srcStart,
int32_t  srcLength 
) const
inline

Determine if this ends with the characters in srcText in the range [srcStart, srcStart + srcLength).

Parameters
srcTextThe text to match.
srcStartthe offset into srcText to start matching
srcLengththe number of characters in srcText to match
Returns
TRUE if this ends with the characters in srcText, FALSE otherwise
Stable:
ICU 2.0

Definition at line 4086 of file unistr.h.

References length().

UBool icu::UnicodeString::endsWith ( const UChar * srcChars,
int32_t  srcLength 
) const
inline

Determine if this ends with the characters in srcChars.

Parameters
srcCharsThe characters to match.
srcLengththe number of characters in srcChars
Returns
TRUE if this ends with the characters in srcChars, FALSE otherwise
Stable:
ICU 2.0

Definition at line 4095 of file unistr.h.

References length(), and u_strlen().

UBool icu::UnicodeString::endsWith ( const UChar * srcChars,
int32_t  srcStart,
int32_t  srcLength 
) const
inline

Determine if this ends with the characters in srcChars in the range [srcStart, srcStart + srcLength).

Parameters
srcCharsThe characters to match.
srcStartthe offset into srcChars to start matching
srcLengththe number of characters in srcChars to match
Returns
TRUE if this ends with the characters in srcChars, FALSE otherwise
Stable:
ICU 2.0

Definition at line 4105 of file unistr.h.

References length(), and u_strlen().

void icu::UnicodeString::extract ( int32_t  start,
int32_t  length,
UChar * dst,
int32_t  dstStart = 0 
) const
inline

Copy the characters in the range [start, start + length) into the array dst, beginning at dstStart.

If the string aliases to dst itself as an external buffer, then extract() will not copy the contents.

Parameters
startoffset of first character which will be copied into the array
lengththe number of characters to extract
dstarray in which to copy characters. The length of dst must be at least (dstStart + length).
dstStartthe offset in dst where the first character will be extracted
Stable:
ICU 2.0

Definition at line 4191 of file unistr.h.

Referenced by extract().

int32_t icu::UnicodeString::extract ( UChar * dest,
int32_t  destCapacity,
UErrorCode & errorCode 
) const

Copy the contents of the string into dest.

This is a convenience function that checks if there is enough space in dest, extracts the entire string if possible, and NUL-terminates dest if possible.

If the string fits into dest but cannot be NUL-terminated (length()==destCapacity) then the error code is set to U_STRING_NOT_TERMINATED_WARNING. If the string itself does not fit into dest (length()>destCapacity) then the error code is set to U_BUFFER_OVERFLOW_ERROR.

If the string aliases to dest itself as an external buffer, then extract() will not copy the contents.

Parameters
destDestination string buffer.
destCapacityNumber of UChars available at dest.
errorCodeICU error code.
Returns
length()
Stable:
ICU 2.0
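
The three outcomes described above (terminated, not terminated, overflow) can be modeled as follows. This is a standalone sketch, not ICU code: ExtractStatus, ExtractResult and extractInto are hypothetical stand-ins for the UErrorCode values and the extract() call, and the sketch assumes that nothing is written to dest on overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Stand-ins for U_ZERO_ERROR, U_STRING_NOT_TERMINATED_WARNING and
// U_BUFFER_OVERFLOW_ERROR.
enum class ExtractStatus { Ok, NotTerminatedWarning, BufferOverflow };

struct ExtractResult {
    std::size_t length;    // always the full string length
    ExtractStatus status;
};

inline ExtractResult extractInto(const std::u16string &s, char16_t *dest,
                                 std::size_t destCapacity) {
    ExtractResult r{s.size(), ExtractStatus::Ok};
    if (s.size() > destCapacity) {           // string does not fit
        r.status = ExtractStatus::BufferOverflow;
        return r;
    }
    std::copy(s.begin(), s.end(), dest);     // whole string fits
    if (s.size() < destCapacity) {
        dest[s.size()] = 0;                  // room for the terminating NUL
    } else {
        r.status = ExtractStatus::NotTerminatedWarning;
    }
    return r;
}

inline void exampleExtractInto() {
    char16_t buf[4];
    const ExtractResult r = extractInto(u"abc", buf, 4);
    assert(r.status == ExtractStatus::Ok && r.length == 3 && buf[3] == 0);
}
```

Because the full length is returned in every case, a caller that sees an overflow can allocate a buffer of at least that size and try again.
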
void icu::UnicodeString::extract ( int32_t  start,
int32_t  length,
UnicodeString & target 
) const
inline

Copy the characters in the range [start, start + length) into the UnicodeString target.

Parameters
startoffset of first character which will be copied
lengththe number of characters to extract
targetUnicodeString into which to copy characters.
Returns
Nothing (the function is declared void); the extracted characters are written to target.
Stable:
ICU 2.0

Definition at line 4198 of file unistr.h.

int32_t icu::UnicodeString::extract ( int32_t  start,
int32_t  startLength,
char *  target,
int32_t  targetCapacity,
enum EInvariant  inv 
) const

Copy the characters in the range [start, start + startLength) into an array of characters.

All characters must be invariant (see utypes.h). Use US_INV as the last, signature-distinguishing parameter.

This function does not write any more than targetCapacity characters but returns the length of the entire output string so that one can allocate a larger buffer and call the function again if necessary. The output string is NUL-terminated if possible.

Parameters
startoffset of first character which will be copied
startLengththe number of characters to extract
targetthe target buffer for extraction, can be NULL if targetCapacity is 0
targetCapacitythe length of the target buffer
invSignature-distinguishing parameter, use US_INV.
Returns
the output string length, not including the terminating NUL
Stable:
ICU 3.2
int32_t icu::UnicodeString::extract ( int32_t  start,
int32_t  startLength,
char *  target,
uint32_t  targetLength 
) const

Copy the characters in the range [start, start + startLength) into an array of characters in the platform's default codepage.

This function does not write any more than targetLength characters but returns the length of the entire output string so that one can allocate a larger buffer and call the function again if necessary. The output string is NUL-terminated if possible.

Parameters
startoffset of first character which will be copied
startLengththe number of characters to extract
targetthe target buffer for extraction
targetLengththe length of the target buffer. If target is NULL, then the number of bytes required for target is returned.
Returns
the output string length, not including the terminating NUL
Stable:
ICU 2.0
int32_t icu::UnicodeString::extract ( int32_t  start,
int32_t  startLength,
char *  target,
const char *  codepage = 0 
) const
inline

Copy the characters in the range [start, start + startLength) into an array of characters in a specified codepage.

The output string is NUL-terminated.

Recommendation: For invariant-character strings use extract(int32_t start, int32_t length, char *target, int32_t targetCapacity, enum EInvariant inv) const because it avoids object code dependencies of UnicodeString on the conversion code.

Parameters
startoffset of first character which will be copied
startLengththe number of characters to extract
targetthe target buffer for extraction
codepagethe desired codepage for the characters. 0 has the special meaning of the default codepage. If codepage is an empty string (""), then a simple conversion is performed on the codepage-invariant subset ("invariant characters") of the platform encoding. See utypes.h. If target is NULL, then the number of bytes required for target is returned. It is assumed that the target is big enough to fit all of the characters.
Returns
the output string length, not including the terminating NUL
Stable:
ICU 2.0

Definition at line 4206 of file unistr.h.

References extract().

int32_t icu::UnicodeString::extract ( int32_t  start,
int32_t  startLength,
char *  target,
uint32_t  targetLength,
const char *  codepage 
) const

Copy the characters in the range [start, start + startLength) into an array of characters in a specified codepage.

This function does not write any more than targetLength characters but returns the length of the entire output string so that one can allocate a larger buffer and call the function again if necessary. The output string is NUL-terminated if possible.

Recommendation: For invariant-character strings use extract(int32_t start, int32_t length, char *target, int32_t targetCapacity, enum EInvariant inv) const because it avoids object code dependencies of UnicodeString on the conversion code.

Parameters
startoffset of first character which will be copied
startLengththe number of characters to extract
targetthe target buffer for extraction
targetLengththe length of the target buffer
codepagethe desired codepage for the characters. 0 has the special meaning of the default codepage. If codepage is an empty string (""), then a simple conversion is performed on the codepage-invariant subset ("invariant characters") of the platform encoding. See utypes.h. If target is NULL, then the number of bytes required for target is returned.
Returns
the output string length, not including the terminating NUL
Stable:
ICU 2.0
int32_t icu::UnicodeString::extract ( char *  dest,
int32_t  destCapacity,
UConverter * cnv,
UErrorCode & errorCode 
) const

Convert the UnicodeString into a codepage string using an existing UConverter.

The output string is NUL-terminated if possible.

This function avoids the overhead of opening and closing a converter if multiple strings are extracted.

Parameters
destdestination string buffer, can be NULL if destCapacity==0
destCapacitythe number of chars available at dest
cnvthe converter object to be used (ucnv_resetFromUnicode() will be called), or NULL for the default converter
errorCodenormal ICU error code
Returns
the length of the output string, not counting the terminating NUL; if the length is greater than destCapacity, then the string will not fit and a buffer of the indicated length would need to be passed in
Stable:
ICU 2.0
void icu::UnicodeString::extractBetween ( int32_t  start,
int32_t  limit,
UChar * dst,
int32_t  dstStart = 0 
) const
inline

Copy the characters in the range [start, limit) into the array dst, beginning at dstStart.

Parameters
startoffset of first character which will be copied into the array
limitoffset immediately following the last character to be copied
dstarray in which to copy characters. The length of dst must be at least (dstStart + (limit - start)).
dstStartthe offset in dst where the first character will be extracted
Stable:
ICU 2.0

Definition at line 4219 of file unistr.h.

virtual void icu::UnicodeString::extractBetween ( int32_t  start,
int32_t  limit,
UnicodeString & target 
) const
virtual

Copy the characters in the range [start, limit) into the UnicodeString target.

Replaceable API.

Parameters
startoffset of first character which will be copied
limitoffset immediately following the last character to be copied
targetUnicodeString into which to copy characters.
Returns
Nothing (the function is declared void); the extracted characters are written to target.
Stable:
ICU 2.0

Implements icu::Replaceable.

UnicodeString& icu::UnicodeString::fastCopyFrom ( const UnicodeString & src)

Almost the same as the assignment operator.

Replace the characters in this UnicodeString with the characters from src.

This function works the same as the assignment operator for all strings except for ones that are readonly aliases.

Starting with ICU 2.4, the assignment operator and the copy constructor allocate a new buffer and copy the buffer contents even for readonly aliases. This function implements the old, more efficient but less safe behavior of making this string also a readonly alias to the same buffer.

The fastCopyFrom function must be used only if it is known that the lifetime of this UnicodeString does not exceed the lifetime of the aliased buffer including its contents, for example for strings from resource bundles or aliases to string constants.

Parameters
srcThe text containing the characters to replace.
Returns
a reference to this
Stable:
ICU 2.4
UnicodeString & icu::UnicodeString::findAndReplace ( const UnicodeString & oldText,
const UnicodeString & newText 
)
inline

Replace all occurrences of characters in oldText with the characters in newText.

Parameters
oldTextthe text containing the search text
newTextthe text containing the replacement text
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4168 of file unistr.h.

References length().

Referenced by findAndReplace().
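
A rough model of replace-all on std::u16string is shown below. It is a sketch, not the ICU implementation: the name findAndReplaceAll is hypothetical, matching is a plain code unit search, matches are taken left to right without overlap, and an empty oldText is assumed to leave the string unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replace every non-overlapping occurrence of oldText in s with newText.
inline std::u16string &findAndReplaceAll(std::u16string &s,
                                         const std::u16string &oldText,
                                         const std::u16string &newText) {
    if (oldText.empty()) return s;  // assumption: nothing to search for
    std::size_t pos = 0;
    while ((pos = s.find(oldText, pos)) != std::u16string::npos) {
        s.replace(pos, oldText.size(), newText);
        pos += newText.size();      // continue after the inserted text
    }
    return s;
}

inline void exampleFindAndReplaceAll() {
    std::u16string s = u"a-b-c";
    findAndReplaceAll(s, u"-", u"--");
    assert(s == u"a--b--c");        // inserted text is not searched again
}
```

Resuming the search after the inserted text keeps the loop finite even when newText contains oldText.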

UnicodeString & icu::UnicodeString::findAndReplace ( int32_t  start,
int32_t  length,
const UnicodeString & oldText,
const UnicodeString & newText 
)
inline

Replace all occurrences of characters in oldText with characters in newText in the range [start, start + length).

Parameters
startthe start of the range in which replace will be performed
lengththe length of the range in which replace will be performed
oldTextthe text containing the search text
newTextthe text containing the replacement text
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4174 of file unistr.h.

References findAndReplace(), and length().

UnicodeString& icu::UnicodeString::findAndReplace ( int32_t  start,
int32_t  length,
const UnicodeString & oldText,
int32_t  oldStart,
int32_t  oldLength,
const UnicodeString & newText,
int32_t  newStart,
int32_t  newLength 
)

Replace all occurrences of characters in oldText in the range [oldStart, oldStart + oldLength) with the characters in newText in the range [newStart, newStart + newLength) in the range [start, start + length).

Parameters
startthe start of the range in which replace will be performed
lengththe length of the range in which replace will be performed
oldTextthe text containing the search text
oldStartthe start of the search range in oldText
oldLengththe length of the search range in oldText
newTextthe text containing the replacement text
newStartthe start of the replacement range in newText
newLengththe length of the replacement range in newText
Returns
a reference to this
Stable:
ICU 2.0
UnicodeString& icu::UnicodeString::foldCase ( uint32_t  options = 0)

Case-folds the characters in this string.

Case-folding is locale-independent and not context-sensitive, but there is an option for whether to include or exclude mappings for dotted I and dotless i that are marked with 'T' in CaseFolding.txt.

The result may be longer or shorter than the original.

Parameters
optionsEither U_FOLD_CASE_DEFAULT or U_FOLD_CASE_EXCLUDE_SPECIAL_I
Returns
A reference to this.
Stable:
ICU 2.0
static UnicodeString icu::UnicodeString::fromUTF32 ( const UChar32 * utf32,
int32_t  length 
)
static

Create a UnicodeString from a UTF-32 string.

Illegal input is replaced with U+FFFD. Otherwise, errors result in a bogus string. Calls u_strFromUTF32WithSub().

Parameters
utf32UTF-32 input string. Must not be NULL.
lengthLength of the input string, or -1 if NUL-terminated.
Returns
A UnicodeString with equivalent UTF-16 contents.
See Also
toUTF32
Stable:
ICU 4.2
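
The conversion with U+FFFD substitution can be sketched in standalone C++. This is a model, not ICU code: utf32ToUtf16 is a hypothetical name, "illegal input" is taken to mean surrogate code points and values above U+10FFFF, and the length == -1 (NUL-terminated) form is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Convert UTF-32 to UTF-16, replacing illegal code points with U+FFFD.
inline std::u16string utf32ToUtf16(const char32_t *src, std::size_t length) {
    std::u16string out;
    for (std::size_t i = 0; i < length; ++i) {
        char32_t c = src[i];
        if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
            c = 0xFFFD;                       // substitute illegal input
        }
        if (c < 0x10000) {
            out.push_back(static_cast<char16_t>(c));
        } else {
            c -= 0x10000;                     // 20 bits split into two surrogates
            out.push_back(static_cast<char16_t>(0xD800 + (c >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (c & 0x3FF)));
        }
    }
    return out;
}

inline void exampleUtf32ToUtf16() {
    const char32_t in[] = {0x61, 0x1F600, 0xD800, 0x110000};
    assert(utf32ToUtf16(in, 4) == u"a\U0001F600\uFFFD\uFFFD");
}
```
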
static UnicodeString icu::UnicodeString::fromUTF8 ( const StringPiece & utf8)
static

Create a UnicodeString from a UTF-8 string.

Illegal input is replaced with U+FFFD. Otherwise, errors result in a bogus string. Calls u_strFromUTF8WithSub().

Parameters
utf8UTF-8 input string. Note that a StringPiece can be implicitly constructed from a std::string or a NUL-terminated const char * string.
Returns
A UnicodeString with equivalent UTF-16 contents.
See Also
toUTF8
toUTF8String
Stable:
ICU 4.2
UChar* icu::UnicodeString::getBuffer ( int32_t  minCapacity)

Get a read/write pointer to the internal buffer.

The buffer is guaranteed to be large enough for at least minCapacity UChars, writable, and is still owned by the UnicodeString object. Calls to getBuffer(minCapacity) must not be nested, and must be matched with calls to releaseBuffer(newLength). If the string buffer was read-only or shared, then it will be reallocated and copied.

An attempted nested call will return 0, and will not further modify the state of the UnicodeString object. It also returns 0 if the string is bogus.

The actual capacity of the string buffer may be larger than minCapacity. getCapacity() returns the actual capacity. For many operations, the full capacity should be used to avoid reallocations.

While the buffer is "open" between getBuffer(minCapacity) and releaseBuffer(newLength), the following applies:

  • The string length is set to 0.
  • Any read API call on the UnicodeString object will behave like on a 0-length string.
  • Any write API call on the UnicodeString object is disallowed and will have no effect.
  • You can read from and write to the returned buffer.
  • The previous string contents will still be in the buffer; if you want to use them, then you need to call length() before getBuffer(minCapacity). If the length() was greater than minCapacity, then any contents after minCapacity may be lost. The buffer contents are not NUL-terminated by getBuffer(). If length()<getCapacity() then you can terminate it by writing a NUL at index length().
  • You must call releaseBuffer(newLength) in order to return to normal UnicodeString operation.
Parameters
minCapacitythe minimum number of UChars that are to be available in the buffer, starting at the returned pointer; default to the current string capacity if minCapacity==-1
Returns
a writable pointer to the internal string buffer, or 0 if an error occurs (nested calls, out of memory)
See Also
releaseBuffer
getTerminatedBuffer()
Stable:
ICU 2.0

Referenced by icu::Normalizer::compare(), icu::UnicodeSet::span(), and icu::UnicodeSet::spanBack().
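
The open/release protocol listed above can be illustrated with a toy class. BufferedString is hypothetical and only mirrors the rules stated in this section (length reads as 0 while open, nested getBuffer returns a null pointer, writes through the object are refused); it does not reproduce ICU's storage, aliasing or bogus-state handling.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

class BufferedString {
public:
    explicit BufferedString(std::u16string s)
        : data_(std::move(s)), length_(data_.size()), open_(false) {}

    // While the buffer is open the string behaves as if it were empty.
    std::size_t length() const { return open_ ? 0 : length_; }

    char16_t *getBuffer(std::size_t minCapacity) {
        if (open_) return nullptr;                 // nested call is refused
        if (data_.size() < minCapacity) data_.resize(minCapacity);
        open_ = true;
        return &data_[0];                          // old contents still present
    }

    void releaseBuffer(std::size_t newLength) {
        if (!open_) return;
        length_ = newLength < data_.size() ? newLength : data_.size();
        open_ = false;
    }

    bool append(char16_t c) {                      // a write API call
        if (open_) return false;                   // disallowed while open
        data_.resize(length_);
        data_.push_back(c);
        ++length_;
        return true;
    }

    std::u16string str() const { return data_.substr(0, length()); }

private:
    std::u16string data_;   // size() plays the role of the capacity
    std::size_t length_;
    bool open_;
};

inline void exampleBufferProtocol() {
    BufferedString s(u"abc");
    const std::size_t oldLength = s.length();      // read length before opening
    char16_t *p = s.getBuffer(8);
    assert(p != nullptr && s.length() == 0);
    p[oldLength] = u'd';                           // write directly into the buffer
    s.releaseBuffer(oldLength + 1);
    assert(s.str() == u"abcd");
}
```

Reading length() before opening the buffer matters because it reports 0 for as long as the buffer is open.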

const UChar * icu::UnicodeString::getBuffer ( ) const
inline

Get a read-only pointer to the internal buffer.

This can be called at any time on a valid UnicodeString.

It returns 0 if the string is bogus, or during an "open" getBuffer(minCapacity).

It can be called as many times as desired. The pointer that it returns will remain valid until the UnicodeString object is modified, at which time the pointer is semantically invalidated and must not be used any more.

The capacity of the buffer can be determined with getCapacity(). The part after length() may or may not be initialized and valid, depending on the history of the UnicodeString object.

The buffer contents are (probably) not NUL-terminated. You can check if they are with (s.length()<s.getCapacity() && buffer[s.length()]==0). (See getTerminatedBuffer().)

The buffer may reside in read-only memory. Its contents must not be modified.

Returns
a read-only pointer to the internal string buffer, or 0 if the string is empty or bogus
See Also
getBuffer(int32_t minCapacity)
getTerminatedBuffer()
Stable:
ICU 2.0

Definition at line 3648 of file unistr.h.

int32_t icu::UnicodeString::getCapacity ( void  ) const
inline

Return the capacity of the internal buffer of the UnicodeString object.

This is useful together with the getBuffer functions. See there for details.

Returns
the number of UChars available in the internal buffer
See Also
getBuffer
Stable:
ICU 2.0

Definition at line 3624 of file unistr.h.

virtual UChar32 icu::UnicodeString::getChar32At ( int32_t  offset) const
protected virtual

The change in Replaceable to use virtual getChar32At() allows UnicodeString::char32At() to be inline again (see jitterbug 709).

Stable:
ICU 2.4

Implements icu::Replaceable.

int32_t icu::UnicodeString::getChar32Limit ( int32_t  offset) const

Adjust a random-access offset so that it points behind a Unicode character.

The offset that is passed in points behind any code unit of a code point, while the returned offset will point behind the last code unit of the same code point. In UTF-16, if the input offset points behind the first surrogate (i.e., to the second surrogate) of a surrogate pair, then the returned offset will point behind the second surrogate (i.e., to the first code unit after the pair).

Parameters
offseta valid offset after any code unit of a code point of the text
Returns
offset of the first code unit after the same code point
See Also
U16_SET_CP_LIMIT
Stable:
ICU 2.0
int32_t icu::UnicodeString::getChar32Start ( int32_t  offset) const

Adjust a random-access offset so that it points to the beginning of a Unicode character.

The offset that is passed in points to any code unit of a code point, while the returned offset will point to the first code unit of the same code point. In UTF-16, if the input offset points to a second surrogate of a surrogate pair, then the returned offset will point to the first surrogate.

Parameters
offseta valid offset into one code point of the text
Returns
offset of the first code unit of the same code point
See Also
U16_SET_CP_START
Stable:
ICU 2.0
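
Both adjustments (getChar32Limit() and getChar32Start()) amount to a one-step check for a surrogate pair around the offset. The sketch below models them on std::u16string; codePointStart and codePointLimit are hypothetical names, and the caller is assumed to pass an offset within the string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline bool isLeadUnit(char16_t u)  { return (u & 0xFC00) == 0xD800; }
inline bool isTrailUnit(char16_t u) { return (u & 0xFC00) == 0xDC00; }

// Move offset back to the lead surrogate if it points at the trail
// surrogate of a pair; otherwise return it unchanged.
inline std::size_t codePointStart(const std::u16string &s, std::size_t offset) {
    if (offset > 0 && offset < s.size() &&
        isTrailUnit(s[offset]) && isLeadUnit(s[offset - 1])) {
        return offset - 1;
    }
    return offset;
}

// Move offset forward past the trail surrogate if it sits between the
// two halves of a pair; otherwise return it unchanged.
inline std::size_t codePointLimit(const std::u16string &s, std::size_t offset) {
    if (offset > 0 && offset < s.size() &&
        isLeadUnit(s[offset - 1]) && isTrailUnit(s[offset])) {
        return offset + 1;
    }
    return offset;
}

inline void exampleCodePointBoundaries() {
    const std::u16string s = u"a\U0001F600b";  // units: 'a', lead, trail, 'b'
    assert(codePointStart(s, 2) == 1);         // trail -> lead
    assert(codePointLimit(s, 2) == 3);         // mid-pair -> after the pair
}
```
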
virtual UChar icu::UnicodeString::getCharAt ( int32_t  offset) const
protected virtual

The change in Replaceable to use virtual getCharAt() allows UnicodeString::charAt() to be inline again (see jitterbug 709).

Stable:
ICU 2.4

Implements icu::Replaceable.

virtual UClassID icu::UnicodeString::getDynamicClassID ( ) const
virtual

ICU "poor man's RTTI", returns a UClassID for the actual class.

Stable:
ICU 2.2

Reimplemented from icu::UObject.

virtual int32_t icu::UnicodeString::getLength ( ) const
protected virtual

Implement Replaceable::getLength() (see jitterbug 1027).

Stable:
ICU 2.4

Implements icu::Replaceable.

static UClassID icu::UnicodeString::getStaticClassID ( )
static

ICU "poor man's RTTI", returns a UClassID for this class.

Stable:
ICU 2.2
const UChar* icu::UnicodeString::getTerminatedBuffer ( )

Get a read-only pointer to the internal buffer, making sure that it is NUL-terminated.

This can be called at any time on a valid UnicodeString.

It returns 0 if the string is bogus, or during an "open" getBuffer(minCapacity), or if the buffer cannot be NUL-terminated (because memory allocation failed).

It can be called as many times as desired. The pointer that it returns will remain valid until the UnicodeString object is modified, at which time the pointer is semantically invalidated and must not be used any more.

The capacity of the buffer can be determined with getCapacity(). The part after length()+1 may or may not be initialized and valid, depending on the history of the UnicodeString object.

The buffer contents are guaranteed to be NUL-terminated. getTerminatedBuffer() may reallocate the buffer if a terminating NUL is written. For this reason, this function is not const, unlike getBuffer(). Note that a UnicodeString may also contain NUL characters as part of its contents.

The buffer may reside in read-only memory. Its contents must not be modified.

Returns
a read-only pointer to the internal string buffer, or 0 if the string is empty or bogus
See Also
getBuffer(int32_t minCapacity)
getBuffer()
Stable:
ICU 2.2
virtual void icu::UnicodeString::handleReplaceBetween ( int32_t  start,
int32_t  limit,
const UnicodeString & text 
)
virtual

Replace a substring of this object with the given text.

Parameters
startthe beginning index, inclusive; 0 <= start <= limit.
limitthe ending index, exclusive; start <= limit <= length().
textthe text to replace characters start to limit - 1
Stable:
ICU 2.0

Implements icu::Replaceable.

int32_t icu::UnicodeString::hashCode ( void  ) const
inline

Generate a hash code for this object.

Returns
The hash code of this UnicodeString.
Stable:
ICU 2.0

Definition at line 3628 of file unistr.h.

virtual UBool icu::UnicodeString::hasMetaData ( ) const
virtual

Replaceable API.

Returns
TRUE if it has MetaData
Stable:
ICU 2.4

Reimplemented from icu::Replaceable.

UBool icu::UnicodeString::hasMoreChar32Than ( int32_t  start,
int32_t  length,
int32_t  number 
) const

Check if the length UChar code units of the string contain more Unicode code points than a certain number.

This is more efficient than counting all code points in this part of the string and comparing that number with a threshold. This function may not need to scan the string at all if the length falls within a certain range, and never needs to count more than 'number+1' code points. Logically equivalent to (countChar32(start, length)>number). A Unicode code point may occupy either one or two UChar code units.

Parameters
startthe index of the first code unit to check (0 for the entire string)
lengththe number of UChar code units to check (use INT32_MAX for the entire string; remember that start/length values are pinned)
numberThe number of code points in the (sub)string is compared against the 'number' parameter.
Returns
Boolean value for whether the string contains more Unicode code points than 'number'. Same as (u_countChar32(s, length)>number).
See Also
countChar32
u_strHasMoreChar32Than
Stable:
ICU 2.4
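
The shortcuts mentioned above follow from a simple bound: n code units hold at least (n + 1) / 2 and at most n code points. The following standalone sketch applies that bound before scanning; hasMoreCodePointsThan is a hypothetical name, the range is clamped to the string, and number is taken as unsigned, which differs from the int32_t parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline bool isLeadCodeUnit(char16_t u)  { return (u & 0xFC00) == 0xD800; }
inline bool isTrailCodeUnit(char16_t u) { return (u & 0xFC00) == 0xDC00; }

inline bool hasMoreCodePointsThan(const std::u16string &s, std::size_t start,
                                  std::size_t length, std::size_t number) {
    if (start > s.size()) start = s.size();
    if (length > s.size() - start) length = s.size() - start;

    if ((length + 1) / 2 > number) return true;   // even all-pairs text exceeds number
    if (length <= number) return false;           // even all-BMP text cannot exceed it

    // Undecided: count, but stop as soon as the threshold is passed.
    const std::size_t limit = start + length;
    std::size_t count = 0;
    for (std::size_t i = start; i < limit; ++i) {
        if (isLeadCodeUnit(s[i]) && i + 1 < limit && isTrailCodeUnit(s[i + 1])) ++i;
        if (++count > number) return true;
    }
    return false;
}

inline void exampleHasMoreCodePointsThan() {
    const std::u16string s = u"a\U0001F600b";     // 4 code units, 3 code points
    assert(hasMoreCodePointsThan(s, 0, s.size(), 2));
    assert(!hasMoreCodePointsThan(s, 0, s.size(), 3));
}
```

The scan only runs when length lies between number + 1 and 2 * number, and it stops after at most number + 1 code points.
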
int32_t icu::UnicodeString::indexOf ( const UnicodeString & text) const
inline

Locate in this the first occurrence of the characters in text, using bitwise comparison.

Parameters
textThe text to search for.
Returns
The offset into this of the start of text, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3905 of file unistr.h.

References length().

Referenced by indexOf().

int32_t icu::UnicodeString::indexOf ( const UnicodeString & text,
int32_t  start 
) const
inline

Locate in this the first occurrence of the characters in text starting at offset start, using bitwise comparison.

Parameters
textThe text to search for.
startThe offset at which searching will start.
Returns
The offset into this of the start of text, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3909 of file unistr.h.

References indexOf(), and length().

int32_t icu::UnicodeString::indexOf ( const UnicodeString & text,
int32_t  start,
int32_t  length 
) const
inline

Locate in this the first occurrence in the range [start, start + length) of the characters in text, using bitwise comparison.

Parameters
textThe text to search for.
startThe offset at which searching will start.
lengthThe number of characters to search
Returns
The offset into this of the start of text, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3916 of file unistr.h.

References indexOf(), and length().

int32_t icu::UnicodeString::indexOf ( const UnicodeString & srcText,
int32_t  srcStart,
int32_t  srcLength,
int32_t  start,
int32_t  length 
) const
inline

Locate in this the first occurrence in the range [start, start + length) of the characters in srcText in the range [srcStart, srcStart + srcLength), using bitwise comparison.

Parameters
srcTextThe text to search for.
srcStartthe offset into srcText at which to start matching
srcLengththe number of characters in srcText to match
startthe offset into this at which to start matching
lengththe number of characters in this to search
Returns
The offset into this of the start of srcText, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3889 of file unistr.h.

References indexOf(), and isBogus().

int32_t icu::UnicodeString::indexOf ( const UChar * srcChars,
int32_t  srcLength,
int32_t  start 
) const
inline

Locate in this the first occurrence of the characters in srcChars starting at offset start, using bitwise comparison.

Parameters
srcCharsThe text to search for.
srcLengththe number of characters in srcChars to match
startthe offset into this at which to start matching
Returns
The offset into this of the start of srcChars, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3922 of file unistr.h.

References indexOf(), and length().

int32_t icu::UnicodeString::indexOf ( const UChar * srcChars,
int32_t  srcLength,
int32_t  start,
int32_t  length 
) const
inline

Locate in this the first occurrence in the range [start, start + length) of the characters in srcChars, using bitwise comparison.

Parameters
srcCharsThe text to search for.
srcLengththe number of characters in srcChars
startThe offset at which searching will start.
lengthThe number of characters to search
Returns
The offset into this of the start of srcChars, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3930 of file unistr.h.

References indexOf().

int32_t icu::UnicodeString::indexOf ( const UChar * srcChars,
int32_t  srcStart,
int32_t  srcLength,
int32_t  start,
int32_t  length 
) const

Locate in this the first occurrence in the range [start, start + length) of the characters in srcChars in the range [srcStart, srcStart + srcLength), using bitwise comparison.

Parameters
srcCharsThe text to search for.
srcStartthe offset into srcChars at which to start matching
srcLengththe number of characters in srcChars to match
startthe offset into this at which to start matching
lengththe number of characters in this to search
Returns
The offset into this of the start of srcChars, or -1 if not found.
Stable:
ICU 2.0
int32_t icu::UnicodeString::indexOf ( UChar  c) const
inline

Locate in this the first occurrence of the BMP code point c, using bitwise comparison.

Parameters
cThe code unit to search for.
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3949 of file unistr.h.

References length().

int32_t icu::UnicodeString::indexOf ( UChar32  c) const
inline

Locate in this the first occurrence of the code point c, using bitwise comparison.

Parameters
cThe code point to search for.
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3953 of file unistr.h.

References indexOf(), and length().

int32_t icu::UnicodeString::indexOf ( UChar  c,
int32_t  start 
) const
inline

Locate in this the first occurrence of the BMP code point c, starting at offset start, using bitwise comparison.

Parameters
cThe code unit to search for.
startThe offset at which searching will start.
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3957 of file unistr.h.

References length().

int32_t icu::UnicodeString::indexOf ( UChar32  c,
int32_t  start 
) const
inline

Locate in this the first occurrence of the code point c starting at offset start, using bitwise comparison.

Parameters
cThe code point to search for.
startThe offset at which searching will start.
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3964 of file unistr.h.

References indexOf(), and length().

int32_t icu::UnicodeString::indexOf ( UChar  c,
int32_t  start,
int32_t  length 
) const
inline

Locate in this the first occurrence of the BMP code point c in the range [start, start + length), using bitwise comparison.

Parameters
cThe code unit to search for.
startthe offset into this at which to start matching
lengththe number of characters in this to search
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3937 of file unistr.h.

int32_t icu::UnicodeString::indexOf ( UChar32  c,
int32_t  start,
int32_t  length 
) const
inline

Locate in this the first occurrence of the code point c in the range [start, start + length), using bitwise comparison.

Parameters
cThe code point to search for.
startthe offset into this at which to start matching
lengththe number of characters in this to search
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3943 of file unistr.h.

UnicodeString & icu::UnicodeString::insert ( int32_t  start,
const UnicodeString & srcText,
int32_t  srcStart,
int32_t  srcLength 
)
inline

Insert the characters in srcText in the range [srcStart, srcStart + srcLength) into the UnicodeString object at offset start.

srcText is not modified.

Parameters
startthe offset where the insertion begins
srcTextthe source for the new characters
srcStartthe offset into srcText where new characters will be obtained
srcLengththe number of characters in srcText in the insert string
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4375 of file unistr.h.

UnicodeString & icu::UnicodeString::insert ( int32_t  start,
const UnicodeString & srcText 
)
inline

Insert the characters in srcText into the UnicodeString object at offset start.

srcText is not modified.

Parameters
startthe offset where the insertion begins
srcTextthe source for the new characters
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4382 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::insert ( int32_t  start,
const UChar * srcChars,
int32_t  srcStart,
int32_t  srcLength 
)
inline

Insert the characters in srcChars in the range [srcStart, srcStart + srcLength) into the UnicodeString object at offset start.

srcChars is not modified.

Parameters
startthe offset at which the insertion begins
srcCharsthe source for the new characters
srcStartthe offset into srcChars where new characters will be obtained
srcLengththe number of characters in srcChars in the insert string
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4387 of file unistr.h.

UnicodeString & icu::UnicodeString::insert ( int32_t  start,
const UChar * srcChars,
int32_t  srcLength 
)
inline

Insert the characters in srcChars into the UnicodeString object at offset start.

srcChars is not modified.

Parameters
startthe offset where the insertion begins
srcCharsthe source for the new characters
srcLengththe number of Unicode characters in srcChars.
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4394 of file unistr.h.

UnicodeString & icu::UnicodeString::insert ( int32_t  start,
UChar  srcChar 
)
inline

Insert the code unit srcChar into the UnicodeString object at offset start.

Parameters
startthe offset at which the insertion occurs
srcCharthe code unit to insert
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4400 of file unistr.h.

UnicodeString & icu::UnicodeString::insert ( int32_t  start,
UChar32  srcChar 
)
inline

Insert the code point srcChar into the UnicodeString object at offset start.

Parameters
startthe offset at which the insertion occurs
srcCharthe code point to insert
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4405 of file unistr.h.

References replace().
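
Inserting a code point may add one or two code units. The following standalone sketch models that on a std::u16string; the helper name insertCodePoint, the pinning of start, and the requirement that c is a valid non-surrogate code point are assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not an ICU function): insert code point c at
// code unit offset start, encoding it as UTF-16.
std::u16string &insertCodePoint(std::u16string &s, int32_t start, char32_t c) {
    const int32_t len = static_cast<int32_t>(s.size());
    // Pin start to [0, len] (an assumption of this sketch).
    if (start < 0) start = 0; else if (start > len) start = len;
    const std::size_t pos = static_cast<std::size_t>(start);

    if (c <= 0xFFFF) {
        // BMP code point: one code unit.
        s.insert(pos, 1, static_cast<char16_t>(c));
    } else {
        // Supplementary code point: lead and trail surrogate.
        const char16_t pair[2] = {
            static_cast<char16_t>(0xD7C0 + (c >> 10)),
            static_cast<char16_t>(0xDC00 + (c & 0x3FF))
        };
        s.insert(pos, pair, 2);
    }
    return s;
}
```

After inserting U+1F600 into a two-unit string, the string holds four code units, so later offsets shift by two rather than one.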

UBool icu::UnicodeString::isBogus ( void  ) const
inline

Determine if this object contains a valid string.

A bogus string has no value. It is different from an empty string, although in both cases isEmpty() returns TRUE and length() returns 0. setToBogus() and isBogus() can be used to indicate that no string value is available. For a bogus string, getBuffer() and getTerminatedBuffer() return NULL, and length() returns 0.

Returns
TRUE if the string is bogus/invalid, FALSE otherwise
See Also
setToBogus()
Stable:
ICU 2.0

Definition at line 3632 of file unistr.h.

Referenced by indexOf(), lastIndexOf(), operator==(), remove(), and truncate().

UBool icu::UnicodeString::isEmpty ( void  ) const
inline

Determine if this string is empty.

Returns
TRUE if this string contains 0 characters, FALSE otherwise.
Stable:
ICU 2.0

Definition at line 4252 of file unistr.h.

int32_t icu::UnicodeString::lastIndexOf ( const UnicodeString & text) const
inline

Locate in this the last occurrence of the characters in text, using bitwise comparison.

Parameters
textThe text to search for.
Returns
The offset into this of the start of text, or -1 if not found.
Stable:
ICU 2.0

Definition at line 4015 of file unistr.h.

References length().

Referenced by lastIndexOf().

int32_t icu::UnicodeString::lastIndexOf ( const UnicodeString & text,
int32_t  start 
) const
inline

Locate in this the last occurrence of the characters in text starting at offset start, using bitwise comparison.

Parameters
textThe text to search for.
startThe offset at which searching will start.
Returns
The offset into this of the start of text, or -1 if not found.
Stable:
ICU 2.0

Definition at line 4008 of file unistr.h.

References lastIndexOf(), and length().

int32_t icu::UnicodeString::lastIndexOf ( const UnicodeString & text,
int32_t  start,
int32_t  length 
) const
inline

Locate in this the last occurrence in the range [start, start + length) of the characters in text, using bitwise comparison.

Parameters
textThe text to search for.
startThe offset at which searching will start.
lengthThe number of characters to search
Returns
The offset into this of the start of text, or -1 if not found.
Stable:
ICU 2.0

Definition at line 4002 of file unistr.h.

References lastIndexOf(), and length().

int32_t icu::UnicodeString::lastIndexOf ( const UnicodeString & srcText,
int32_t  srcStart,
int32_t  srcLength,
int32_t  start,
int32_t  length 
) const
inline

Locate in this the last occurrence in the range [start, start + length) of the characters in srcText in the range [srcStart, srcStart + srcLength), using bitwise comparison.

Parameters
srcTextThe text to search for.
srcStartthe offset into srcText at which to start matching
srcLengththe number of characters in srcText to match
startthe offset into this at which to start matching
lengththe number of characters in this to search
Returns
The offset into this of the start of srcText, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3986 of file unistr.h.

References isBogus(), and lastIndexOf().
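
The range form can be read as "the match must lie entirely inside [start, start + length)". A standalone sketch of that reading on a std::u16string follows. The helper name lastIndexOfInRange, the pinning of the range, and returning -1 for an empty search string are assumptions of this sketch.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not an ICU function): last occurrence of needle
// that fits completely inside the code unit range [start, start + length).
int32_t lastIndexOfInRange(const std::u16string &s, const std::u16string &needle,
                           int32_t start, int32_t length) {
    const int32_t len = static_cast<int32_t>(s.size());
    // Pin the range to the string bounds (an assumption of this sketch).
    if (start < 0) start = 0; else if (start > len) start = len;
    if (length < 0) length = 0; else if (length > len - start) length = len - start;

    const int32_t n = static_cast<int32_t>(needle.size());
    if (n == 0 || n > length) return -1;

    // Scan candidate start offsets from the end of the range backward.
    for (int32_t i = start + length - n; i >= start; --i) {
        if (s.compare(static_cast<std::size_t>(i), static_cast<std::size_t>(n), needle) == 0) {
            return i;
        }
    }
    return -1;
}
```

With s = u"abcabcabc", searching the whole string finds offset 6, while searching [0, 8) finds offset 3 because the final "abc" would extend past the range.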

int32_t icu::UnicodeString::lastIndexOf ( const UChar * srcChars,
int32_t  srcLength,
int32_t  start 
) const
inline

Locate in this the last occurrence of the characters in srcChars starting at offset start, using bitwise comparison.

Parameters
srcCharsThe text to search for.
srcLengththe number of characters in srcChars to match
startthe offset into this at which to start matching
Returns
The offset into this of the start of srcChars, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3978 of file unistr.h.

References lastIndexOf(), and length().

int32_t icu::UnicodeString::lastIndexOf ( const UChar * srcChars,
int32_t  srcLength,
int32_t  start,
int32_t  length 
) const
inline

Locate in this the last occurrence in the range [start, start + length) of the characters in srcChars, using bitwise comparison.

Parameters
srcCharsThe text to search for.
srcLengththe number of characters in srcChars
startThe offset at which searching will start.
lengthThe number of characters to search
Returns
The offset into this of the start of srcChars, or -1 if not found.
Stable:
ICU 2.0

Definition at line 3971 of file unistr.h.

References lastIndexOf().

int32_t icu::UnicodeString::lastIndexOf ( const UChar * srcChars,
int32_t  srcStart,
int32_t  srcLength,
int32_t  start,
int32_t  length 
) const

Locate in this the last occurrence in the range [start, start + length) of the characters in srcChars in the range [srcStart, srcStart + srcLength), using bitwise comparison.

Parameters
srcCharsThe text to search for.
srcStartthe offset into srcChars at which to start matching
srcLengththe number of characters in srcChars to match
startthe offset into this at which to start matching
lengththe number of characters in this to search
Returns
The offset into this of the start of srcChars, or -1 if not found.
Stable:
ICU 2.0
int32_t icu::UnicodeString::lastIndexOf ( UChar  c) const
inline

Locate in this the last occurrence of the BMP code point c, using bitwise comparison.

Parameters
cThe code unit to search for.
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 4032 of file unistr.h.

References length().

int32_t icu::UnicodeString::lastIndexOf ( UChar32  c) const
inline

Locate in this the last occurrence of the code point c, using bitwise comparison.

Parameters
cThe code point to search for.
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 4036 of file unistr.h.

References lastIndexOf(), and length().

int32_t icu::UnicodeString::lastIndexOf ( UChar  c,
int32_t  start 
) const
inline

Locate in this the last occurrence of the BMP code point c starting at offset start, using bitwise comparison.

Parameters
cThe code unit to search for.
startThe offset at which searching will start.
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 4041 of file unistr.h.

References length().

int32_t icu::UnicodeString::lastIndexOf ( UChar32  c,
int32_t  start 
) const
inline

Locate in this the last occurrence of the code point c starting at offset start, using bitwise comparison.

Parameters
cThe code point to search for.
startThe offset at which searching will start.
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 4048 of file unistr.h.

References lastIndexOf(), and length().

int32_t icu::UnicodeString::lastIndexOf ( UChar  c,
int32_t  start,
int32_t  length 
) const
inline

Locate in this the last occurrence of the BMP code point c in the range [start, start + length), using bitwise comparison.

Parameters
cThe code unit to search for.
startthe offset into this at which to start matching
lengththe number of characters in this to search
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 4019 of file unistr.h.

int32_t icu::UnicodeString::lastIndexOf ( UChar32  c,
int32_t  start,
int32_t  length 
) const
inline

Locate in this the last occurrence of the code point c in the range [start, start + length), using bitwise comparison.

Parameters
cThe code point to search for.
startthe offset into this at which to start matching
lengththe number of characters in this to search
Returns
The offset into this of c, or -1 if not found.
Stable:
ICU 2.0

Definition at line 4025 of file unistr.h.

int32_t icu::UnicodeString::length ( void  ) const
inline

Return the length of the UnicodeString object.

The length is the number of UChar code units in the UnicodeString. If you want the number of code points, please use countChar32().

Returns
the length of the UnicodeString object
See Also
countChar32
Stable:
ICU 2.0

Reimplemented from icu::Replaceable.

Definition at line 3620 of file unistr.h.

Referenced by append(), caseCompare(), compare(), icu::Normalizer::compare(), compareCodePointOrder(), endsWith(), findAndReplace(), indexOf(), insert(), lastIndexOf(), operator+=(), operator<(), operator<=(), operator=(), operator==(), operator>(), operator>=(), replace(), replaceBetween(), reverse(), setTo(), icu::UnicodeSet::span(), icu::UnicodeSet::spanBack(), startsWith(), and truncate().
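
The gap between code unit length and code point count can be sketched without ICU. The helper countCodePoints below is hypothetical and operates on a std::u16string; it only illustrates why the two numbers differ when surrogate pairs are present.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not an ICU function): count code points in a
// UTF-16 string. A lead surrogate followed by a trail surrogate counts once;
// an unpaired surrogate counts as one code point by itself.
int32_t countCodePoints(const std::u16string &s) {
    int32_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        ++count;
        const bool isLead = s[i] >= 0xD800 && s[i] <= 0xDBFF;
        if (isLead && i + 1 < s.size() &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            ++i;  // skip the trail surrogate of this pair
        }
    }
    return count;
}
```

For u"a\U00010000b" the code unit length is 4 while the code point count is 3.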

int32_t icu::UnicodeString::moveIndex32 ( int32_t  index,
int32_t  delta 
) const

Move the code unit index along the string by delta code points.

Interpret the input index as a code unit-based offset into the string, move the index forward or backward by delta code points, and return the resulting index. The input index should point to the first code unit of a code point, if there is more than one.

Both input and output indexes are code unit-based as for all string indexes/offsets in ICU (and other libraries, like MBCS char*). If delta<0 then the index is moved backward (toward the start of the string). If delta>0 then the index is moved forward (toward the end of the string).

This behaves like CharacterIterator::move32(delta, kCurrent).

Behavior for out-of-bounds indexes: moveIndex32 pins the input index to 0..length(), i.e., if the input index<0 then it is pinned to 0; if it is index>length() then it is pinned to length(). Afterwards, the index is moved by delta code points forward or backward, but no further backward than to 0 and no further forward than to length(). The resulting index return value will be in between 0 and length(), inclusively.

Examples:

// s has code points 'a' U+10000 'b' U+10ffff U+2029
UnicodeString s=UNICODE_STRING("a\\U00010000b\\U0010ffff\\u2029", 28).unescape();
// initial index: position of U+10000
int32_t index=1;
// the following examples will all result in index==4, position of U+10ffff
// skip 2 code points from some position in the string
index=s.moveIndex32(index, 2); // skips U+10000 and 'b'
// go to the 3rd code point from the start of s (0-based)
index=s.moveIndex32(0, 3); // skips 'a', U+10000, and 'b'
// go to the next-to-last code point of s
index=s.moveIndex32(s.length(), -2); // backward-skips U+2029 and U+10ffff
Parameters
indexinput code unit index
delta(signed) code point count to move the index forward or backward in the string
Returns
the resulting code unit index
Stable:
ICU 2.0
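
The pinning and stepping rules described for moveIndex32 can be modeled on a std::u16string. This is a standalone sketch, not the ICU implementation; the function name moveIndexByCodePoints and the surrogate helpers are hypothetical.

```cpp
#include <cstdint>
#include <string>

static bool isLeadSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static bool isTrailSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Hypothetical model: move a code unit index by delta code points,
// staying within [0, s.size()].
int32_t moveIndexByCodePoints(const std::u16string &s, int32_t index, int32_t delta) {
    const int32_t len = static_cast<int32_t>(s.size());
    // Pin the input index to 0..len.
    if (index < 0) index = 0; else if (index > len) index = len;

    while (delta > 0 && index < len) {
        // Forward: a complete surrogate pair is one step of two code units.
        if (isLeadSurrogate(s[index]) && index + 1 < len &&
            isTrailSurrogate(s[index + 1])) {
            index += 2;
        } else {
            ++index;
        }
        --delta;
    }
    while (delta < 0 && index > 0) {
        // Backward: step over a trail surrogate and its preceding lead together.
        --index;
        if (isTrailSurrogate(s[index]) && index > 0 &&
            isLeadSurrogate(s[index - 1])) {
            --index;
        }
        ++delta;
    }
    return index;
}
```

Applied to the code points 'a' U+10000 'b' U+10ffff U+2029 (7 code units), all three calls from the documented example land on index 4 in this model.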
UBool icu::UnicodeString::operator!= ( const UnicodeString & text) const
inline

Inequality operator.

Performs only bitwise comparison.

Parameters
textThe UnicodeString to compare to this one.
Returns
FALSE if text contains the same characters as this one, TRUE otherwise.
Stable:
ICU 2.0

Definition at line 3688 of file unistr.h.

UnicodeString & icu::UnicodeString::operator+= ( UChar  ch)
inline

Append operator.

Append the code unit ch to the UnicodeString object.

Parameters
chthe code unit to be appended
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4362 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::operator+= ( UChar32  ch)
inline

Append operator.

Append the code point ch to the UnicodeString object.

Parameters
chthe code point to be appended
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4366 of file unistr.h.

References append().

UnicodeString & icu::UnicodeString::operator+= ( const UnicodeString & srcText)
inline

Append operator.

Append the characters in srcText to the UnicodeString object. srcText is not modified.

Parameters
srcTextthe source for the new characters
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4371 of file unistr.h.

References length().

UBool icu::UnicodeString::operator< ( const UnicodeString & text) const
inline

Less than operator.

Performs only bitwise comparison.

Parameters
textThe UnicodeString to compare to this one.
Returns
TRUE if the characters in this are bitwise less than the characters in text, FALSE otherwise
Stable:
ICU 2.0

Definition at line 3696 of file unistr.h.

References length().
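
Bitwise comparison orders strings by their UTF-16 code unit values, which can differ from code point order once supplementary characters are involved. The sketch below models this on std::u16string; the helper name compareCodeUnits is hypothetical, and the -1/0/+1 result convention is an assumption chosen for the illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical model of a bitwise (code unit order) comparison.
// Returns -1, 0, or +1.
int8_t compareCodeUnits(const std::u16string &a, const std::u16string &b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) {
            return a[i] < b[i] ? -1 : 1;
        }
    }
    // All shared code units are equal: the shorter string sorts first.
    if (a.size() == b.size()) return 0;
    return a.size() < b.size() ? -1 : 1;
}
```

U+FF5E is a smaller code point than U+10000, yet its single code unit 0xFF5E is greater than the lead surrogate 0xD800, so the code unit comparison ranks u"\uFF5E" after u"\U00010000". When code point order matters, compareCodePointOrder() is the function to look at.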

UBool icu::UnicodeString::operator<= ( const UnicodeString & text) const
inline

Less than or equal operator.

Performs only bitwise comparison.

Parameters
textThe UnicodeString to compare to this one.
Returns
TRUE if the characters in this are bitwise less than or equal to the characters in text, FALSE otherwise
Stable:
ICU 2.0

Definition at line 3704 of file unistr.h.

References length().

UnicodeString& icu::UnicodeString::operator= ( const UnicodeString & srcText)

Assignment operator.

Replace the characters in this UnicodeString with the characters from srcText.

Parameters
srcTextThe text containing the characters to replace
Returns
a reference to this
Stable:
ICU 2.0
UnicodeString & icu::UnicodeString::operator= ( UChar  ch)
inline

Assignment operator.

Replace the characters in this UnicodeString with the code unit ch.

Parameters
chthe code unit to replace
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4283 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::operator= ( UChar32  ch)
inline

Assignment operator.

Replace the characters in this UnicodeString with the code point ch.

Parameters
chthe code point to replace
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4287 of file unistr.h.

References length(), and replace().

UBool icu::UnicodeString::operator== ( const UnicodeString & text) const
inline

Equality operator.

Performs only bitwise comparison.

Parameters
textThe UnicodeString to compare to this one.
Returns
TRUE if text contains the same characters as this one, FALSE otherwise.
Stable:
ICU 2.0

Definition at line 3677 of file unistr.h.

References isBogus(), and length().

UBool icu::UnicodeString::operator> ( const UnicodeString & text) const
inline

Greater than operator.

Performs only bitwise comparison.

Parameters
textThe UnicodeString to compare to this one.
Returns
TRUE if the characters in this are bitwise greater than the characters in text, FALSE otherwise
Stable:
ICU 2.0

Definition at line 3692 of file unistr.h.

References length().

UBool icu::UnicodeString::operator>= ( const UnicodeString & text) const
inline

Greater than or equal operator.

Performs only bitwise comparison.

Parameters
textThe UnicodeString to compare to this one.
Returns
TRUE if the characters in this are bitwise greater than or equal to the characters in text, FALSE otherwise
Stable:
ICU 2.0

Definition at line 3700 of file unistr.h.

References length().

UChar icu::UnicodeString::operator[] ( int32_t  offset) const
inline

Return the code unit at offset offset.

If the offset is not valid (0..length()-1) then U+ffff is returned.

Parameters
offseta valid offset into the text
Returns
the code unit at offset offset
Stable:
ICU 2.0

Definition at line 4248 of file unistr.h.
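
The out-of-range rule can be expressed as a small bounds-checked accessor. The sketch below is a model on std::u16string with a hypothetical name, charAtOrFFFF; it is not the ICU source.

```cpp
#include <cstdint>
#include <string>

// Hypothetical model: return the code unit at offset, or 0xFFFF when the
// offset is outside 0..size()-1.
char16_t charAtOrFFFF(const std::u16string &s, int32_t offset) {
    if (offset >= 0 && offset < static_cast<int32_t>(s.size())) {
        return s[static_cast<std::size_t>(offset)];
    }
    return 0xFFFF;
}
```

Because U+FFFF is a noncharacter, callers can treat it as an "invalid offset" marker as long as the text itself is not expected to contain that code unit.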

UBool icu::UnicodeString::padLeading ( int32_t  targetLength,
UChar  padChar = 0x0020 
)

Pad the start of this UnicodeString with the character padChar.

If the length of this UnicodeString is less than targetLength, targetLength - length() copies of padChar will be added to the beginning of this UnicodeString.

Parameters
targetLengththe desired length of the string
padCharthe character to use for padding. Defaults to space (U+0020)
Returns
TRUE if the text was padded, FALSE otherwise.
Stable:
ICU 2.0
UBool icu::UnicodeString::padTrailing ( int32_t  targetLength,
UChar  padChar = 0x0020 
)

Pad the end of this UnicodeString with the character padChar.

If the length of this UnicodeString is less than targetLength, targetLength - length() copies of padChar will be added to the end of this UnicodeString.

Parameters
targetLengththe desired length of the string
padCharthe character to use for padding. Defaults to space (U+0020)
Returns
TRUE if the text was padded, FALSE otherwise.
Stable:
ICU 2.0
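
The padding arithmetic for padLeading and padTrailing can be sketched on a std::u16string. The function names padLeadingModel and padTrailingModel are hypothetical; the sketch only shows how many pad characters are added and what the return value signals.

```cpp
#include <cstdint>
#include <string>

// Hypothetical model of padLeading: prepend padChar until the string has
// targetLength code units. Returns true only if padding was added.
bool padLeadingModel(std::u16string &s, int32_t targetLength, char16_t padChar = 0x0020) {
    const int32_t len = static_cast<int32_t>(s.size());
    if (len >= targetLength) return false;
    s.insert(s.begin(), static_cast<std::size_t>(targetLength - len), padChar);
    return true;
}

// Hypothetical model of padTrailing: append padChar until the string has
// targetLength code units. Returns true only if padding was added.
bool padTrailingModel(std::u16string &s, int32_t targetLength, char16_t padChar = 0x0020) {
    const int32_t len = static_cast<int32_t>(s.size());
    if (len >= targetLength) return false;
    s.append(static_cast<std::size_t>(targetLength - len), padChar);
    return true;
}
```

Padding u"42" to a target length of 5 with '0' adds three characters; calling again with a target of 3 leaves the string unchanged and returns false.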
void icu::UnicodeString::releaseBuffer ( int32_t  newLength = -1)

Release a read/write buffer on a UnicodeString object with an "open" getBuffer(minCapacity).

This function must be called in a matched pair with getBuffer(minCapacity). releaseBuffer(newLength) must be called if and only if a getBuffer(minCapacity) is "open".

It will set the string length to newLength, at most to the current capacity. If newLength==-1 then it will set the length according to the first NUL in the buffer, or to the capacity if there is no NUL.

After calling releaseBuffer(newLength) the UnicodeString is back to normal operation.

Parameters
newLengththe new length of the UnicodeString object; defaults to the current capacity if newLength is greater than that; if newLength==-1, it defaults to u_strlen(buffer) but not more than the current capacity of the string
See Also
getBuffer(int32_t minCapacity)
Stable:
ICU 2.0
UnicodeString & icu::UnicodeString::remove ( void  )
inline

Remove all characters from the UnicodeString object.

Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4411 of file unistr.h.

References isBogus().

UnicodeString & icu::UnicodeString::remove ( int32_t  start,
int32_t  length = (int32_t)INT32_MAX 
)
inline

Remove the characters in the range [start, start + length) from the UnicodeString object.

Parameters
startthe offset of the first character to remove
lengththe number of characters to remove
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4423 of file unistr.h.

References INT32_MAX, and NULL.

UnicodeString & icu::UnicodeString::removeBetween ( int32_t  start,
int32_t  limit = (int32_t)INT32_MAX 
)
inline

Remove the characters in the range [start, limit) from the UnicodeString object.

Parameters
startthe offset of the first character to remove
limitthe offset immediately following the range to remove
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4434 of file unistr.h.

References NULL.
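
remove() takes a start and a length, while removeBetween() takes a start and a limit. A standalone sketch of the two conventions on std::u16string follows; the names removeModel and removeBetweenModel and the pinning of out-of-range arguments are assumptions of this sketch.

```cpp
#include <cstdint>
#include <string>

// Hypothetical model of remove(start, length): erase [start, start + length).
std::u16string &removeModel(std::u16string &s, int32_t start,
                            int32_t length = INT32_MAX) {
    const int32_t len = static_cast<int32_t>(s.size());
    // Pin the range to the string bounds (an assumption of this sketch).
    if (start < 0) start = 0; else if (start > len) start = len;
    if (length < 0) length = 0; else if (length > len - start) length = len - start;
    s.erase(static_cast<std::size_t>(start), static_cast<std::size_t>(length));
    return s;
}

// Hypothetical model of removeBetween(start, limit): erase [start, limit).
std::u16string &removeBetweenModel(std::u16string &s, int32_t start,
                                   int32_t limit = INT32_MAX) {
    const int32_t len = static_cast<int32_t>(s.size());
    if (start < 0) start = 0; else if (start > len) start = len;
    if (limit < start) limit = start; else if (limit > len) limit = len;
    s.erase(static_cast<std::size_t>(start), static_cast<std::size_t>(limit - start));
    return s;
}
```

On u"0123456789", removeModel(s, 2, 3) and removeBetweenModel(s, 2, 5) erase the same three code units; with the default second argument both erase through the end of the string.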

UnicodeString & icu::UnicodeString::replace ( int32_t  start,
int32_t  length,
const UnicodeString & srcText,
int32_t  srcStart,
int32_t  srcLength 
)
inline

Replace the characters in the range [start, start + length) with the characters in srcText in the range [srcStart, srcStart + srcLength).

srcText is not modified.

Parameters
startthe offset at which the replace operation begins
lengththe number of characters to replace. The character at start + length is not modified.
srcTextthe source for the new characters
srcStartthe offset into srcText where new characters will be obtained
srcLengththe number of characters in srcText in the replace string
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4125 of file unistr.h.

Referenced by insert(), operator=(), and setTo().

UnicodeString & icu::UnicodeString::replace ( int32_t  start,
int32_t  length,
const UnicodeString & srcText 
)
inline

Replace the characters in the range [start, start + length) with the characters in srcText.

srcText is not modified.

Parameters
startthe offset at which the replace operation begins
lengththe number of characters to replace. The character at start + length is not modified.
srcTextthe source for the new characters
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4119 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::replace ( int32_t  start,
int32_t  length,
const UChar * srcChars,
int32_t  srcStart,
int32_t  srcLength 
)
inline

Replace the characters in the range [start, start + length) with the characters in srcChars in the range [srcStart, srcStart + srcLength).

srcChars is not modified.

Parameters
startthe offset at which the replace operation begins
lengththe number of characters to replace. The character at start + length is not modified.
srcCharsthe source for the new characters
srcStartthe offset into srcChars where new characters will be obtained
srcLengththe number of characters in srcChars in the replace string
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4140 of file unistr.h.

UnicodeString & icu::UnicodeString::replace ( int32_t  start,
int32_t  length,
const UChar * srcChars,
int32_t  srcLength 
)
inline

Replace the characters in the range [start, start + length) with the characters in srcChars.

srcChars is not modified.

Parameters
startthe offset at which the replace operation begins
lengthnumber of characters to replace. The character at start + length is not modified.
srcCharsthe source for the new characters
srcLengththe number of Unicode characters in srcChars
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4133 of file unistr.h.

UnicodeString & icu::UnicodeString::replace ( int32_t  start,
int32_t  length,
UChar  srcChar 
)
inline

Replace the characters in the range [start, start + length) with the code unit srcChar.

Parameters
startthe offset at which the replace operation begins
lengththe number of characters to replace. The character at start + length is not modified.
srcCharthe new code unit
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4148 of file unistr.h.

UnicodeString& icu::UnicodeString::replace ( int32_t  start,
int32_t  length,
UChar32  srcChar 
)

Replace the characters in the range [start, start + length) with the code point srcChar.

Parameters
startthe offset at which the replace operation begins
lengththe number of characters to replace. The character at start + length is not modified.
srcCharthe new code point
Returns
a reference to this
Stable:
ICU 2.0
UnicodeString & icu::UnicodeString::replaceBetween ( int32_t  start,
int32_t  limit,
const UnicodeString & srcText 
)
inline

Replace the characters in the range [start, limit) with the characters in srcText.

srcText is not modified.

Parameters
startthe offset at which the replace operation begins
limitthe offset immediately following the replace range
srcTextthe source for the new characters
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4154 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::replaceBetween ( int32_t  start,
int32_t  limit,
const UnicodeString & srcText,
int32_t  srcStart,
int32_t  srcLimit 
)
inline

Replace the characters in the range [start, limit) with the characters in srcText in the range [srcStart, srcLimit).

srcText is not modified.

Parameters
startthe offset at which the replace operation begins
limitthe offset immediately following the replace range
srcTextthe source for the new characters
srcStartthe offset into srcText where new characters will be obtained
srcLimitthe offset immediately following the range to copy in srcText
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4160 of file unistr.h.
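
The limit-based form maps onto the length-based form by subtracting the start offsets. The sketch below models replaceBetween on std::u16string; the names replaceBetweenModel and pinBetween and the pinning behavior are assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>

// Clamp [start, limit) to a string of len code units (sketch assumption).
static void pinBetween(int32_t &start, int32_t &limit, int32_t len) {
    if (start < 0) start = 0; else if (start > len) start = len;
    if (limit < start) limit = start; else if (limit > len) limit = len;
}

// Hypothetical model: replace [start, limit) of s with [srcStart, srcLimit)
// of src.
std::u16string &replaceBetweenModel(std::u16string &s, int32_t start, int32_t limit,
                                    const std::u16string &src,
                                    int32_t srcStart, int32_t srcLimit) {
    pinBetween(start, limit, static_cast<int32_t>(s.size()));
    pinBetween(srcStart, srcLimit, static_cast<int32_t>(src.size()));
    // Convert both limit-based ranges to (offset, length) pairs.
    s.replace(static_cast<std::size_t>(start),
              static_cast<std::size_t>(limit - start),
              src,
              static_cast<std::size_t>(srcStart),
              static_cast<std::size_t>(srcLimit - srcStart));
    return s;
}
```

Replacing [7, 12) of u"Hello, world" with [4, 10) of u"big planet!" yields u"Hello, planet".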

UnicodeString & icu::UnicodeString::retainBetween ( int32_t  start,
int32_t  limit = INT32_MAX 
)
inline

Retain only the characters in the range [start, limit) from the UnicodeString object.

Removes characters before start and at and after limit.

Parameters
startthe offset of the first character to retain
limitthe offset immediately following the range to retain
Returns
a reference to this
Stable:
ICU 4.4

Definition at line 4439 of file unistr.h.

References NULL, and truncate().
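
Retaining a range amounts to two steps: cut off everything at and after limit, then drop everything before start. The following standalone sketch models that on std::u16string; the name retainBetweenModel and the handling of negative or out-of-range arguments are assumptions of this sketch.

```cpp
#include <cstdint>
#include <string>

// Hypothetical model: keep only [start, limit) of s.
std::u16string &retainBetweenModel(std::u16string &s, int32_t start,
                                   int32_t limit = INT32_MAX) {
    int32_t len = static_cast<int32_t>(s.size());
    // Step 1: drop everything at and after limit (the truncation step).
    if (limit < 0) limit = 0;
    if (limit < len) {
        s.resize(static_cast<std::size_t>(limit));
        len = limit;
    }
    // Step 2: drop everything before start.
    if (start < 0) start = 0; else if (start > len) start = len;
    s.erase(0, static_cast<std::size_t>(start));
    return s;
}
```

On u"0123456789", retaining [3, 6) leaves u"345", and retaining from 8 with the default limit leaves u"89".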

UnicodeString & icu::UnicodeString::reverse ( void  )
inline

Reverse this UnicodeString in place.

Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4460 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::reverse ( int32_t  start,
int32_t  length 
)
inline

Reverse the range [start, start + length) in this UnicodeString.

Parameters
startthe start of the range to reverse
lengththe number of characters to reverse
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4464 of file unistr.h.

UnicodeString& icu::UnicodeString::setCharAt ( int32_t  offset,
UChar  ch 
)

Set the character at the specified offset to the specified character.

Parameters
offsetA valid offset into the text of the character to set
chThe new character
Returns
A reference to this
Stable:
ICU 2.0
UnicodeString & icu::UnicodeString::setTo ( const UnicodeString & srcText,
int32_t  srcStart 
)
inline

Set the text in the UnicodeString object to the characters in srcText in the range [srcStart, srcText.length()).

srcText is not modified.

Parameters
srcTextthe source for the new characters
srcStartthe offset into srcText where new characters will be obtained
Returns
a reference to this
Stable:
ICU 2.2

Definition at line 4300 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::setTo ( const UnicodeString & srcText,
int32_t  srcStart,
int32_t  srcLength 
)
inline

Set the text in the UnicodeString object to the characters in srcText in the range [srcStart, srcStart + srcLength).

srcText is not modified.

Parameters
srcTextthe source for the new characters
srcStartthe offset into srcText where new characters will be obtained
srcLengththe number of characters in srcText in the replace string.
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4291 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::setTo ( const UnicodeString & srcText)
inline

Set the text in the UnicodeString object to the characters in srcText.

srcText is not modified.

Parameters
srcTextthe source for the new characters
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4309 of file unistr.h.

UnicodeString & icu::UnicodeString::setTo ( const UChar * srcChars,
int32_t  srcLength 
)
inline

Set the characters in the UnicodeString object to the characters in srcChars.

srcChars is not modified.

Parameters
srcCharsthe source for the new characters
srcLengththe number of Unicode characters in srcChars.
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4315 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::setTo ( UChar  srcChar)
inline

Set the characters in the UnicodeString object to the code unit srcChar.

Parameters
srcCharthe code unit which becomes the UnicodeString's character content
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4323 of file unistr.h.

References length().

UnicodeString & icu::UnicodeString::setTo ( UChar32  srcChar)
inline

Set the characters in the UnicodeString object to the code point srcChar.

Parameters
srcCharthe code point which becomes the UnicodeString's character content
Returns
a reference to this
Stable:
ICU 2.0

Definition at line 4330 of file unistr.h.

References length(), and replace().

UnicodeString& icu::UnicodeString::setTo ( UBool  isTerminated,
const UChar * text,
int32_t  textLength 
)

Aliasing setTo() function, analogous to the readonly-aliasing UChar* constructor.

The text will be used for the UnicodeString object, but it will not be released when the UnicodeString is destroyed. This has copy-on-write semantics: When the string is modified, then the buffer is first copied into newly allocated memory. The aliased buffer is never modified.

In an assignment to another UnicodeString, when using the copy constructor or the assignment operator, the text will be copied. When using fastCopyFrom(), the text will be aliased again, so that both strings then alias the same readonly-text.

Parameters
isTerminatedspecifies if text is NUL-terminated. This must be true if textLength==-1.
textThe characters to alias for the UnicodeString.
textLengthThe number of Unicode characters in text to alias. If -1, then this function will determine the length by calling u_strlen().
Returns
a reference to this
Stable:
ICU 2.0
UnicodeString& icu::UnicodeString::setTo ( UChar * buffer,
int32_t  buffLength,
int32_t  buffCapacity 
)

Aliasing setTo() function, analogous to the writable-aliasing UChar* constructor.

The text will be used for the UnicodeString object, but it will not be released when the UnicodeString is destroyed. This has write-through semantics: For as long as the capacity of the buffer is sufficient, write operations will directly affect the buffer. When more capacity is necessary, then a new buffer will be allocated and the contents copied as with regularly constructed strings. In an assignment to another UnicodeString, the buffer will be copied. The extract(UChar *dst) function detects whether the dst pointer is the same as the string buffer itself and will in this case not copy the contents.

Parameters
bufferThe characters to alias for the UnicodeString.
buffLengthThe number of Unicode characters in buffer to alias.
buffCapacityThe size of buffer in UChars.
Returns
a reference to this
Stable:
ICU 2.0
void icu::UnicodeString::setToBogus ( )

Make this UnicodeString object invalid.

The string will test TRUE with isBogus().

A bogus string has no value. It is different from an empty string. It can be used to indicate that no string value is available. getBuffer() and getTerminatedBuffer() return NULL, and length() returns 0.

This utility function is used throughout the UnicodeString implementation to indicate that a UnicodeString operation failed, and may be used in other functions, especially but not exclusively when such functions do not take a UErrorCode for simplicity.

The following methods, and no others, will clear a string object's bogus flag: remove(), remove(0, INT32_MAX), truncate(0), operator=() (assignment operator), and setTo(...).

The simplest way to turn a bogus string into an empty one is to use the remove() function. Examples for other functions that are equivalent to "set to empty string":

if(s.isBogus()) {
  s.remove(); // set to an empty string (remove all), or
  s.remove(0, INT32_MAX); // set to an empty string (remove all), or
  s.truncate(0); // set to an empty string (complete truncation), or
  s=UnicodeString(); // assign an empty string, or
  s.setTo((UChar32)-1); // set to a pseudo code point that is out of range, or
  static const UChar nul=0;
  s.setTo(&nul, 0); // set to an empty C Unicode string
}
See Also
isBogus()
Stable:
ICU 2.0
UBool icu::UnicodeString::startsWith ( const UnicodeString & text) const
inline

Determine if this starts with the characters in text

Parameters
textThe text to match.
Returns
TRUE if this starts with the characters in text, FALSE otherwise
Stable:
ICU 2.0

Definition at line 4055 of file unistr.h.

References compare(), and length().

UBool icu::UnicodeString::startsWith ( const UnicodeString & srcText,
int32_t  srcStart,
int32_t  srcLength 
) const
inline

Determine if this starts with the characters in srcText in the range [srcStart, srcStart + srcLength).

Parameters
srcText  The text to match.
srcStart  The offset into srcText to start matching.
srcLength  The number of characters in srcText to match.
Returns
TRUE if this starts with the characters in srcText in the given range, FALSE otherwise
Stable:
ICU 2.0

Definition at line 4059 of file unistr.h.

UBool icu::UnicodeString::startsWith ( const UChar *  srcChars,
int32_t  srcLength 
) const
inline

Determine if this starts with the characters in srcChars

Parameters
srcChars  The characters to match.
srcLength  The number of characters in srcChars.
Returns
TRUE if this starts with the characters in srcChars, FALSE otherwise
Stable:
ICU 2.0

Definition at line 4065 of file unistr.h.

References u_strlen().

UBool icu::UnicodeString::startsWith ( const UChar *  srcChars,
int32_t  srcStart,
int32_t  srcLength 
) const
inline

Determine if this starts with the characters in srcChars in the range [srcStart, srcStart + srcLength).

Parameters
srcChars  The characters to match.
srcStart  The offset into srcChars to start matching.
srcLength  The number of characters in srcChars to match.
Returns
TRUE if this starts with the characters in srcChars in the given range, FALSE otherwise
Stable:
ICU 2.0

Definition at line 4073 of file unistr.h.

References u_strlen().

UnicodeString icu::UnicodeString::tempSubString ( int32_t  start = 0,
int32_t  length = INT32_MAX 
) const

Create a temporary substring for the specified range.

Unlike the substring constructor and setTo() functions, the object returned here will be a read-only alias (using getBuffer()) rather than copying the text. As a result, this substring operation is much faster but requires that the original string not be modified or deleted during the lifetime of the returned substring object.

Parameters
start  Offset of the first character visible in the substring.
length  Length of the substring.
Returns
a read-only alias UnicodeString object for the substring
Stable:
ICU 4.4

Referenced by icu::MessagePattern::getSubstring(), and tempSubStringBetween().

UnicodeString icu::UnicodeString::tempSubStringBetween ( int32_t  start,
int32_t  limit = INT32_MAX 
) const
inline

Create a temporary substring for the specified range.

Same as tempSubString(start, length) except that the substring range is specified as a (start, limit) pair (with an exclusive limit index) rather than a (start, length) pair.

Parameters
start  Offset of the first character visible in the substring.
limit  Offset immediately following the last character visible in the substring.
Returns
a read-only alias UnicodeString object for the substring
Stable:
ICU 4.4

Definition at line 4229 of file unistr.h.

References tempSubString().

UnicodeString& icu::UnicodeString::toLower ( void  )

Convert the characters in this to lower case following the conventions of the default locale.

Returns
A reference to this.
Stable:
ICU 2.0
UnicodeString& icu::UnicodeString::toLower ( const Locale &  locale)

Convert the characters in this to lower case following the conventions of a specific locale.

Parameters
locale  The locale containing the conventions to use.
Returns
A reference to this.
Stable:
ICU 2.0
UnicodeString& icu::UnicodeString::toTitle ( BreakIterator *  titleIter)

Titlecase this string, convenience function using the default locale.

Casing is locale-dependent and context-sensitive. Titlecasing uses a break iterator to find the first characters of words that are to be titlecased. It titlecases those characters and lowercases all others.

The titlecase break iterator can be provided to customize for arbitrary styles, using rules and dictionaries beyond the standard iterators. It may be more efficient to always provide an iterator to avoid opening and closing one for each string. The standard titlecase iterator for the root locale implements the algorithm of Unicode TR 21.

This function uses only the setText(), first() and next() methods of the provided break iterator.

Parameters
titleIter  A break iterator to find the first characters of words that are to be titlecased. If none is provided (0), then a standard titlecase break iterator is opened. Otherwise the provided iterator is set to the string's text.
Returns
A reference to this.
Stable:
ICU 2.1
UnicodeString& icu::UnicodeString::toTitle ( BreakIterator *  titleIter,
const Locale &  locale 
)

Titlecase this string.

Casing is locale-dependent and context-sensitive. Titlecasing uses a break iterator to find the first characters of words that are to be titlecased. It titlecases those characters and lowercases all others.

The titlecase break iterator can be provided to customize for arbitrary styles, using rules and dictionaries beyond the standard iterators. It may be more efficient to always provide an iterator to avoid opening and closing one for each string. The standard titlecase iterator for the root locale implements the algorithm of Unicode TR 21.

This function uses only the setText(), first() and next() methods of the provided break iterator.

Parameters
titleIter  A break iterator to find the first characters of words that are to be titlecased. If none is provided (0), then a standard titlecase break iterator is opened. Otherwise the provided iterator is set to the string's text.
locale  The locale to consider.
Returns
A reference to this.
Stable:
ICU 2.1
UnicodeString& icu::UnicodeString::toTitle ( BreakIterator *  titleIter,
const Locale &  locale,
uint32_t  options 
)

Titlecase this string, with options.

Casing is locale-dependent and context-sensitive. Titlecasing uses a break iterator to find the first characters of words that are to be titlecased. It titlecases those characters and lowercases all others. (This can be modified with options.)

The titlecase break iterator can be provided to customize for arbitrary styles, using rules and dictionaries beyond the standard iterators. It may be more efficient to always provide an iterator to avoid opening and closing one for each string. The standard titlecase iterator for the root locale implements the algorithm of Unicode TR 21.

This function uses only the setText(), first() and next() methods of the provided break iterator.

Parameters
titleIter  A break iterator to find the first characters of words that are to be titlecased. If none is provided (0), then a standard titlecase break iterator is opened. Otherwise the provided iterator is set to the string's text.
locale  The locale to consider.
options  Options bit set, see ucasemap_open().
Returns
A reference to this.
See Also
U_TITLECASE_NO_LOWERCASE
U_TITLECASE_NO_BREAK_ADJUSTMENT
ucasemap_open
Stable:
ICU 3.8
UnicodeString& icu::UnicodeString::toUpper ( void  )

Convert the characters in this to UPPER CASE following the conventions of the default locale.

Returns
A reference to this.
Stable:
ICU 2.0
UnicodeString& icu::UnicodeString::toUpper ( const Locale &  locale)

Convert the characters in this to UPPER CASE following the conventions of a specific locale.

Parameters
locale  The locale containing the conventions to use.
Returns
A reference to this.
Stable:
ICU 2.0
int32_t icu::UnicodeString::toUTF32 ( UChar32 *  utf32,
int32_t  capacity,
UErrorCode &  errorCode 
) const

Convert the UnicodeString to UTF-32.

Unpaired surrogates are replaced with U+FFFD. Calls u_strToUTF32WithSub().

Parameters
utf32  Destination string buffer, can be NULL if capacity==0.
capacity  The number of UChar32s available at utf32.
errorCode  Standard ICU error code. Its input value must pass the U_SUCCESS() test, or else the function returns immediately. Check for U_FAILURE() on output or use with function chaining. (See User Guide for details.)
Returns
The length of the UTF-32 string.
See Also
fromUTF32
Stable:
ICU 4.2
void icu::UnicodeString::toUTF8 ( ByteSink &  sink) const

Convert the UnicodeString to UTF-8 and write the result to a ByteSink.

This is called by toUTF8String(). Unpaired surrogates are replaced with U+FFFD. Calls u_strToUTF8WithSub().

Parameters
sink  A ByteSink to which the UTF-8 version of the string is written. sink.Flush() is called at the end.
Stable:
ICU 4.2
See Also
toUTF8String
template<typename StringClass >
StringClass& icu::UnicodeString::toUTF8String ( StringClass &  result) const
inline

Convert the UnicodeString to UTF-8 and append the result to a standard string.

Unpaired surrogates are replaced with U+FFFD. Calls toUTF8().

Parameters
result  A standard string (or a compatible object) to which the UTF-8 version of the string is appended.
Returns
The string object.
Stable:
ICU 4.2
See Also
toUTF8

Definition at line 1683 of file unistr.h.

UnicodeString& icu::UnicodeString::trim ( void  )

Trims leading and trailing whitespace from this UnicodeString.

Returns
a reference to this
Stable:
ICU 2.0
UBool icu::UnicodeString::truncate ( int32_t  targetLength)
inline

Truncate this UnicodeString to the targetLength.

Parameters
targetLength  The desired length of this UnicodeString.
Returns
TRUE if the text was truncated, FALSE otherwise
Stable:
ICU 2.0

Definition at line 4445 of file unistr.h.

References FALSE, isBogus(), length(), and TRUE.

Referenced by retainBetween(), and icu::Transliterator::setID().

UnicodeString icu::UnicodeString::unescape ( ) const

Unescape a string of characters and return a string containing the result.

The following escape sequences are recognized:

\uhhhh  4 hex digits; h in [0-9A-Fa-f]
\Uhhhhhhhh  8 hex digits
\xhh  1-2 hex digits
\ooo  1-3 octal digits; o in [0-7]
\cX  control-X; X is masked with 0x1F

as well as the standard ANSI C escapes:

\a => U+0007, \b => U+0008, \t => U+0009, \n => U+000A, \v => U+000B, \f => U+000C, \r => U+000D, \e => U+001B, \" => U+0022, \' => U+0027, \? => U+003F, \\ => U+005C

Anything else following a backslash is generically escaped. For example, "[a\-z]" returns "[a-z]".

If an escape sequence is ill-formed, this method returns an empty string. An example of an ill-formed sequence is "\\u" followed by fewer than 4 hex digits.

This function is similar to u_unescape() but not identical to it. The latter takes a source char*, so it does escape recognition and also invariant conversion.

Returns
a string with backslash escapes interpreted, or an empty string on error.
See Also
UnicodeString::unescapeAt()
u_unescape()
u_unescapeAt()
Stable:
ICU 2.0
UChar32 icu::UnicodeString::unescapeAt ( int32_t &  offset) const

Unescape a single escape sequence and return the represented character.

See unescape() for a listing of the recognized escape sequences. The character at offset-1 is assumed (without checking) to be a backslash. If the escape sequence is ill-formed, or the offset is out of range, U_SENTINEL=-1 is returned.

Parameters
offset  An input/output parameter. On input, it is the offset into this string where the escape sequence is located, after the initial backslash. On output, it is advanced after the last character parsed. On error, it is not advanced at all.
Returns
the character represented by the escape sequence at offset, or U_SENTINEL=-1 on error.
See Also
UnicodeString::unescape()
u_unescape()
u_unescapeAt()
Stable:
ICU 2.0

The documentation for this class was generated from the following file: