ucoleitr.h File Reference

C API: UCollationElements.

#include "unicode/utypes.h"
#include "unicode/ucol.h"

Defines

#define UCOL_NULLORDER   ((int32_t)0xFFFFFFFF)
 This indicates that an error has occurred during processing or that there are no more CEs to be returned.
#define UCOL_PROCESSED_NULLORDER   ((int64_t)U_INT64_MAX)
 This indicates that an error has occurred during processing or that there are no more CEs to be returned.

Typedefs

typedef struct UCollationElements UCollationElements
 The UCollationElements struct.

Functions

UCollationElements * ucol_openElements (const UCollator *coll, const UChar *text, int32_t textLength, UErrorCode *status)
 Open the collation elements for a string.
int32_t ucol_keyHashCode (const uint8_t *key, int32_t length)
 Get a hash code for a key.
void ucol_closeElements (UCollationElements *elems)
 Close a UCollationElements.
void ucol_reset (UCollationElements *elems)
 Reset the collation elements to their initial state.
void ucol_forceHanImplicit (UCollationElements *elems, UErrorCode *status)
 Set the collation elements to use implicit ordering for Han even if they've been tailored.
int32_t ucol_next (UCollationElements *elems, UErrorCode *status)
 Get the ordering priority of the next collation element in the text.
int32_t ucol_previous (UCollationElements *elems, UErrorCode *status)
 Get the ordering priority of the previous collation element in the text.
int64_t ucol_nextProcessed (UCollationElements *elems, int32_t *ixLow, int32_t *ixHigh, UErrorCode *status)
 Get the processed ordering priority of the next collation element in the text.
int64_t ucol_previousProcessed (UCollationElements *elems, int32_t *ixLow, int32_t *ixHigh, UErrorCode *status)
 Get the processed ordering priority of the previous collation element in the text.
int32_t ucol_getMaxExpansion (const UCollationElements *elems, int32_t order)
 Get the maximum length of any expansion sequences that end with the specified comparison order.
void ucol_setText (UCollationElements *elems, const UChar *text, int32_t textLength, UErrorCode *status)
 Set the text containing the collation elements.
int32_t ucol_getOffset (const UCollationElements *elems)
 Get the offset of the current source character.
void ucol_setOffset (UCollationElements *elems, int32_t offset, UErrorCode *status)
 Set the offset of the current source character.
int32_t ucol_primaryOrder (int32_t order)
 Get the primary order of a collation order.
int32_t ucol_secondaryOrder (int32_t order)
 Get the secondary order of a collation order.
int32_t ucol_tertiaryOrder (int32_t order)
 Get the tertiary order of a collation order.

Detailed Description

C API: UCollationElements.

The UCollationElements API is used as an iterator to walk through each character of an international string. Use the iterator to return the ordering priority of the positioned character. The ordering priority of a character, which we refer to as a key, defines how a character is collated in the given collation object. For example, consider the following in Spanish:

 .       "ca" -> the first key is key('c') and second key is key('a').
 .       "cha" -> the first key is key('ch') and second key is key('a').
 

And in German,

 .       "<ae ligature>b"-> the first key is key('a'), the second key is key('e'), and
 .       the third key is key('b').
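
The two cases above, one key covering several characters and one character yielding several keys, can be pictured with a toy model. The sketch below does not use ICU and does not reflect ICU's data or algorithms; `ToyMapping`, `toy_spanish`, `toy_german` and `toy_keys` are hypothetical names invented for illustration, and the tables contain only the single mapping needed for each example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical toy table entry, NOT ICU data: a source substring and
 * the labels of the keys it yields. */
typedef struct {
    const char *source;   /* substring matched in the text        */
    const char *keys[2];  /* labels of the keys produced for it   */
    int key_count;        /* number of valid entries in keys[]    */
} ToyMapping;

/* Contraction: two characters, one key. */
static const ToyMapping toy_spanish[] = {
    { "ch", { "ch", NULL }, 1 }
};

/* Expansion: one character (ae ligature, UTF-8 bytes C3 A6), two keys. */
static const ToyMapping toy_german[] = {
    { "\xC3\xA6", { "a", "e" }, 2 }
};

/* Writes the space-separated key labels for text into out and returns
 * the number of keys written. Bytes without a table entry map to
 * themselves. Stops early if out is too small. */
int toy_keys(const ToyMapping *table, int table_len,
             const char *text, char *out, size_t out_size)
{
    int count = 0;
    size_t used = 0;

    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';
    while (*text != '\0') {
        const char *labels[2];
        char single[2];
        size_t step = 1;
        int n = 0, i, k;

        for (i = 0; i < table_len; ++i) {
            size_t len = strlen(table[i].source);
            if (strncmp(text, table[i].source, len) == 0) {
                for (k = 0; k < table[i].key_count; ++k) {
                    labels[n++] = table[i].keys[k];
                }
                step = len;
                break;
            }
        }
        if (n == 0) {               /* no mapping: the byte is its own key */
            single[0] = *text;
            single[1] = '\0';
            labels[n++] = single;
        }
        for (k = 0; k < n; ++k) {
            int w = snprintf(out + used, out_size - used, "%s%s",
                             count ? " " : "", labels[k]);
            if (w < 0 || (size_t)w >= out_size - used) {
                return count;       /* output buffer exhausted */
            }
            used += (size_t)w;
            ++count;
        }
        text += step;
    }
    return count;
}
```

With these tables, `toy_keys(toy_spanish, 1, "cha", ...)` produces the labels `ch a` (two keys), and the German table applied to the ligature followed by `b` produces `a e b` (three keys), matching the examples above.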
 

Example of the iterator usage: (without error checking)

 .  #include <stdlib.h>
 .  #include <string.h>
 .  #include "unicode/ucol.h"
 .  #include "unicode/ucoleitr.h"
 .  #include "unicode/ustring.h"
 .
 .  void CollationElementIterator_Example()
 .  {
 .      UChar *s;
 .      int32_t order;
 .      UCollationElements *c;
 .      UCollator *coll;
 .      UErrorCode success = U_ZERO_ERROR;
 .      s = (UChar*)malloc(sizeof(UChar) * (strlen("This is a test") + 1));
 .      u_uastrcpy(s, "This is a test");
 .      coll = ucol_open(NULL, &success);
 .      c = ucol_openElements(coll, s, u_strlen(s), &success);
 .      order = ucol_next(c, &success);      /* first collation order */
 .      ucol_reset(c);
 .      order = ucol_previous(c, &success);  /* last collation order */
 .      ucol_closeElements(c);               /* close the iterator before its collator */
 .      ucol_close(coll);
 .      free(s);
 .  }
 

ucol_next() returns the collation order of the next character. ucol_previous() returns the collation order of the previous character. The Collation Element Iterator moves in only one direction between calls to ucol_reset(); that is, calls to ucol_next() and ucol_previous() cannot be interleaved. Whenever ucol_previous() is to be called after ucol_next(), or vice versa, ucol_reset() has to be called first to reset the state, shifting the pointer to either the end or the start of the string. Hence at the next call of ucol_previous() or ucol_next(), the last or first collation order, respectively, will be returned. If the direction is changed without a call to ucol_reset(), the result is undefined.

The result of a forward iteration (ucol_next()) and the reversed result of a backward iteration (ucol_previous()) on the same string are equivalent if collation orders with the value UCOL_IGNORABLE are ignored. Character based on the comparison level of the collator. A collation order consists of a primary order, a secondary order and a tertiary order. The data type of the collation order is int32_t.

See also:
UCollator

Definition in file ucoleitr.h.


Define Documentation

#define UCOL_NULLORDER   ((int32_t)0xFFFFFFFF)

This indicates that an error has occurred during processing or that there are no more CEs to be returned.

Stable:
ICU 2.0

Definition at line 28 of file ucoleitr.h.

#define UCOL_PROCESSED_NULLORDER   ((int64_t)U_INT64_MAX)

This indicates that an error has occurred during processing or that there are no more CEs to be returned.

Internal:
Do not use. This API is for internal use only.

Definition at line 36 of file ucoleitr.h.


Typedef Documentation

typedef struct UCollationElements UCollationElements

The UCollationElements struct.

For usage in C programs.

Stable:
ICU 2.0

Definition at line 45 of file ucoleitr.h.


Function Documentation

void ucol_closeElements ( UCollationElements *  elems  ) 

Close a UCollationElements.

Once closed, a UCollationElements may no longer be used.

Parameters:
elems The UCollationElements to close.
Stable:
ICU 2.0
void ucol_forceHanImplicit ( UCollationElements *  elems,
UErrorCode *  status 
)

Set the collation elements to use implicit ordering for Han even if they've been tailored.

This will also force Hangul syllables to be ordered by decomposing them to their component Jamo.

Parameters:
elems The UCollationElements containing the text.
status A pointer to a UErrorCode to receive any errors.
Internal:
Do not use. This API is for internal use only.
int32_t ucol_getMaxExpansion ( const UCollationElements *  elems,
int32_t  order 
)

Get the maximum length of any expansion sequences that end with the specified comparison order.

This is useful for .... ?

Parameters:
elems The UCollationElements containing the text.
order A collation order returned by previous or next.
Returns:
The maximum size of the expansion sequences ending with the collation element, or 1 if the collation element does not occur at the end of any expansion sequence.
Stable:
ICU 2.0

Referenced by CollationElementIterator::getMaxExpansion().

int32_t ucol_getOffset ( const UCollationElements *  elems  ) 

Get the offset of the current source character.

This is an offset into the text of the character containing the current collation elements.

Parameters:
elems The UCollationElements to query.
Returns:
The offset of the current source character.
See also:
ucol_setOffset
Stable:
ICU 2.0
int32_t ucol_keyHashCode ( const uint8_t *  key,
int32_t  length 
)

Get a hash code for a key.

.. Not very useful!

Parameters:
key the given key.
length the size of the key array.
Returns:
the hash code.
Stable:
ICU 2.0
int32_t ucol_next ( UCollationElements *  elems,
UErrorCode *  status 
)

Get the ordering priority of the next collation element in the text.

A single character may contain more than one collation element.

Parameters:
elems The UCollationElements containing the text.
status A pointer to a UErrorCode to receive any errors.
Returns:
The next collation element's ordering, or UCOL_NULLORDER if an error has occurred or the end of the string has been reached.
Stable:
ICU 2.0
int64_t ucol_nextProcessed ( UCollationElements *  elems,
int32_t *  ixLow,
int32_t *  ixHigh,
UErrorCode *  status 
)

Get the processed ordering priority of the next collation element in the text.

A single character may contain more than one collation element.

Parameters:
elems The UCollationElements containing the text.
ixLow A pointer to an int32_t to receive the iterator index before fetching the CE.
ixHigh A pointer to an int32_t to receive the iterator index after fetching the CE.
status A pointer to a UErrorCode to receive any errors.
Returns:
The next collation element's ordering, or UCOL_PROCESSED_NULLORDER if an error has occurred or the end of the string has been reached.
Internal:
Do not use. This API is for internal use only.
UCollationElements* ucol_openElements ( const UCollator *  coll,
const UChar *  text,
int32_t  textLength,
UErrorCode *  status 
)

Open the collation elements for a string.

Parameters:
coll The collator containing the desired collation rules.
text The text to iterate over.
textLength The number of characters in text, or -1 if null-terminated.
status A pointer to a UErrorCode to receive any errors.
Returns:
A struct containing collation element information.
Stable:
ICU 2.0
int32_t ucol_previous ( UCollationElements *  elems,
UErrorCode *  status 
)

Get the ordering priority of the previous collation element in the text.

A single character may contain more than one collation element. Note that internally a stack is used to store buffered collation elements. It is very rare for the stack to overflow; if that does happen, the problem can be solved by increasing the value of UCOL_EXPAND_CE_BUFFER_SIZE in ucol_imp.h.

Parameters:
elems The UCollationElements containing the text.
status A pointer to a UErrorCode to receive any errors. Notably, a U_BUFFER_OVERFLOW_ERROR is returned if the internal stack buffer has been exhausted.
Returns:
The previous collation element's ordering, or UCOL_NULLORDER if an error has occurred or the start of the string has been reached.
Stable:
ICU 2.0
int64_t ucol_previousProcessed ( UCollationElements *  elems,
int32_t *  ixLow,
int32_t *  ixHigh,
UErrorCode *  status 
)

Get the processed ordering priority of the previous collation element in the text.

A single character may contain more than one collation element. Note that internally a stack is used to store buffered collation elements. It is very rare for the stack to overflow; if that does happen, the problem can be solved by increasing the value of UCOL_EXPAND_CE_BUFFER_SIZE in ucol_imp.h.

Parameters:
elems The UCollationElements containing the text.
ixLow A pointer to an int32_t to receive the iterator index after fetching the CE.
ixHigh A pointer to an int32_t to receive the iterator index before fetching the CE.
status A pointer to a UErrorCode to receive any errors. Notably, a U_BUFFER_OVERFLOW_ERROR is returned if the internal stack buffer has been exhausted.
Returns:
The previous collation element's ordering, or UCOL_PROCESSED_NULLORDER if an error has occurred or the start of the string has been reached.
Internal:
Do not use. This API is for internal use only.
int32_t ucol_primaryOrder ( int32_t  order  ) 

Get the primary order of a collation order.

Parameters:
order the collation order
Returns:
the primary order of a collation order.
Stable:
ICU 2.6
void ucol_reset ( UCollationElements *  elems  ) 

Reset the collation elements to their initial state.

This will move the 'cursor' to the beginning of the text. Property settings for collation will be reset to the current status.

Parameters:
elems The UCollationElements to reset.
See also:
ucol_next
ucol_previous
Stable:
ICU 2.0
int32_t ucol_secondaryOrder ( int32_t  order  ) 

Get the secondary order of a collation order.

Parameters:
order the collation order
Returns:
the secondary order of a collation order.
Stable:
ICU 2.6
void ucol_setOffset ( UCollationElements *  elems,
int32_t  offset,
UErrorCode *  status 
)

Set the offset of the current source character.

This is an offset into the text of the character to be processed. Property settings for collation will remain the same. To reset the iterator to the current collation property settings, ucol_reset() has to be called.

Parameters:
elems The UCollationElements to set.
offset The desired character offset.
status A pointer to a UErrorCode to receive any errors.
See also:
ucol_getOffset
Stable:
ICU 2.0
void ucol_setText ( UCollationElements *  elems,
const UChar *  text,
int32_t  textLength,
UErrorCode *  status 
)

Set the text containing the collation elements.

Property settings for collation will remain the same. To reset the iterator to the current collation property settings, ucol_reset() has to be called.

Parameters:
elems The UCollationElements to set.
text The source text containing the collation elements.
textLength The length of text, or -1 if null-terminated.
status A pointer to a UErrorCode to receive any errors.
See also:
ucol_getText
Stable:
ICU 2.0
int32_t ucol_tertiaryOrder ( int32_t  order  ) 

Get the tertiary order of a collation order.

Parameters:
order the collation order
Returns:
the tertiary order of a collation order.
Stable:
ICU 2.6
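
The three accessor functions can be pictured as bit-field extractions from the 32-bit collation order. The sketch below is not the ICU implementation; it assumes a layout of 16 primary bits, 8 secondary bits and 8 tertiary bits (from most to least significant), and the `sketch_*` names are hypothetical stand-ins for illustration only.

```c
#include <stdint.h>

/* Assumed layout (an assumption of this sketch, not taken from this page):
 *   bits 31..16  primary order
 *   bits 15..8   secondary order
 *   bits  7..0   tertiary order
 * The cast to uint32_t avoids shifting a negative signed value. */
int32_t sketch_primary(int32_t order)
{
    return (int32_t)(((uint32_t)order >> 16) & 0xFFFFu);
}

int32_t sketch_secondary(int32_t order)
{
    return (int32_t)(((uint32_t)order >> 8) & 0xFFu);
}

int32_t sketch_tertiary(int32_t order)
{
    return (int32_t)((uint32_t)order & 0xFFu);
}
```

Under that assumed layout, an order of 0x12345678 splits into primary 0x1234, secondary 0x56 and tertiary 0x78. In real code, call ucol_primaryOrder(), ucol_secondaryOrder() and ucol_tertiaryOrder() rather than hard-coding masks.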
Generated on Sat Jan 23 15:17:38 2010 for ICU 4.3.4 by  doxygen 1.6.1