// Copyright (c) 2006 IBM

ICU Tech Meeting 02/17/2006 2PM

Andy George Ram Mark Deborah Vladimir Markus Tex Doug Eric

Agenda:
* dictionary break iterator
* wrap up on Freezable UText
* ICU utf-8 usage discussion
* charset recognition C port postmortem
* discuss! bugs
* Mark/Yoshito - globalization prefs
* Mark/Flexible date/time stuff

* Deborah (dictionary break iterator):
  - Break iterator dictionary - to have UData header or not
  - we want dictionary binary data to be in a bundle - where udata should go. Writing data will be a ures call.
  - Dictionary C++ class doesn't need to know about udata info - loading code needs to know about udata. But that code is part of the break engine factory. The factory will load the dictionary and give it to the dictionary engine.

* wrap up on Freezable UText
  - Andy questions: name of the function
  - Mark: we settled on 'freeze' & 'freezable'
  - break iterator already has isWritable and a readOnly parameter.
  - UText should have isReadOnly.
  - Only for UText, the function should be setReadOnly. isReadOnly, and readOnly as a param on clone.
  - Deborah's bug requested that characters could cross chunk boundaries (4873). Markus strongly disagrees: this makes a mess of the macros. Motivation: data structures may not look at the contents of chunks.
  - Deborah to send out before and after changes for ICU code.

* ICU utf-8 usage discussion
  - Tex: people are asking for functionality that is in ICU, but with a utf-8 interface.
  - Andy: general solution is UText - we want to expand the list of ICU services that accept UText and UText wrappers for utf-8. Which services have high importance?
  - Tex: timeframe?
  - Andy: UText exists on break iteration. regex is a high priority.
  - Mark: Google has the same interest. On the C side, utf-8 is standard.
  - Apple: break iteration, regex
  - Tex: the constraint is that many technologies are utf-8 based. It is not just light processing.

+-+- progress on standing code contribution document. -+-+

  - Deborah: Andy was planning to convert break iterator to UText. Yes, it is going to happen in the near future.

* Mark/Yoshito - globalization prefs
  - needs CLDR data

* Mark/Flexible date/time stuff. Mark working on J version. Deborah working on C version.
  - Tex: date/time format performance improvements?
  - Deborah: share read only calendars.
  + performance agenda item for next

* charset recognition C port postmortem

* discuss! bugs
  + Tex: support for additional encodings. Big5 was added. Found a lot of big5 mislabeling. Everything is therefore made Big5 HKSCS.
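
Illustration for the Freezable UText item: a minimal C++ sketch of the read-only behavior discussed there (isReadOnly, setReadOnly, and readOnly as a param on clone). The class FreezableText, its append method, and demoFreezable are hypothetical names made up for this sketch. This is not the UText API and not a proposed signature, only one way the semantics could look, assuming setReadOnly is a one-way switch.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a text object. NOT the ICU UText API.
class FreezableText {
public:
    explicit FreezableText(std::u16string s) : text_(std::move(s)) {}

    bool isReadOnly() const { return readOnly_; }

    // Assumption: one-way switch. There is deliberately no way to unset it.
    void setReadOnly() { readOnly_ = true; }

    // The caller chooses whether the copy is read-only, independent of
    // the state of the original.
    FreezableText clone(bool readOnly) const {
        FreezableText copy(text_);
        copy.readOnly_ = readOnly;
        return copy;
    }

    // Returns false and leaves the text untouched if the object is read-only.
    bool append(const std::u16string& s) {
        if (readOnly_) return false;
        text_ += s;
        return true;
    }

    const std::u16string& text() const { return text_; }

private:
    std::u16string text_;
    bool readOnly_ = false;
};

// Usage walk-through: write, freeze, fail to write, then write to a
// writable clone of the frozen object.
inline bool demoFreezable() {
    FreezableText t(u"abc");
    bool wroteBefore = t.append(u"d");   // object is still writable
    t.setReadOnly();
    bool wroteAfter = t.append(u"e");    // rejected, object is read-only
    FreezableText w = t.clone(false);    // writable copy
    bool cloneWrote = w.append(u"e");
    return wroteBefore && !wroteAfter && cloneWrote &&
           t.text() == u"abcd" && w.text() == u"abcde";
}
```

In this sketch a read-only object can be shared freely, and any caller that needs to modify the text takes a writable clone first.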