- NLS stands for National Language Support but it entails much more than just language.
Globalization Capabilities
Character Sets
- Some applications (MS Word) use the character sets provided with the OS, while Oracle provides its own character sets which means that it is the same on all platforms.
- Single-byte character sets use only one byte (8 bits or less) per character.
- Multi-byte character sets use one, two or three bytes per character.
- Fixed width character sets use the same number of bytes for each character.
- Varying width character sets use a different number of bytes per character.
- Single-byte seven-bit – uses a single byte per character but only 7 of the 8 bits per character
- 2^7 total possible characters (128)
- Usually inadequate
- Oracle default (US7ASCII)
- Single-byte eight-bit – uses a single byte per character and 8 bits per character
- 2^8 total possible characters (256)
- Adequate for Western European languages but perhaps not Eastern Europe and definitely not Asian languages (ex. WE8ISO8859P15)
- Fixed-width multibyte
- Unicode (AL16UTF16)
- Varying-width single-byte
- Characters may be from one to four bytes
- Unicode (UTF8)
- Varying-width multibyte
- May use shift-out code to indicate that the following bytes are double-byte characters and shift-in code to indicate single-byte characters.
Language Support
- The number of languages depends on platform, release & patch level.
- The view V$NLS_VALID_VALUES shows all possible languages (American, German, French, Canadian French, etc.)
- The parameters NLS_LANGUAGE, NLS_DATE_LANGUAGE & NLS_SORT will determine error message language, date language & sort order, respectively.
- SQL> select * from v$nls_valid_values where parameter=’SORT’;
- BINARY is the default value for NLS_SORT and is poor for any language besides English. It simply sorts by the ASCII numeric value of the character set.
Territory Support
- The territory selected sets a number of parameter defaults for day and week numbering, credit and debit symbols, date formats (nls_date_format), decimal and group numeric separators (nls_numeric_characters), and currency symbols (nls_currency).
Globalization Support Features
- Can be specified at any and all of five levels:
- Database
- Instance
- Client environment
- Session
- Statement
Choosing a Character Set
- The character set is used to store data in columns of type: VARCHAR2, CLOB, CHAR and LONG. Changing the character set can destroy data in these columns.
- The National Character Set is a secondary character set specified at database creation time. It is used for columns of datatype: NVARCHAR2, NCLOB and NCHAR.
- Must be either AL16UTF16 or UTF8. Other unicode sets are not supported.
- The only limitation on the database character set is that it must have either US7ASCII or EBCDIC as a subset since SQL and PL/SQL are stored in these char sets.
- The default character set is US7ASCII and the default national character set is AL16UTF16 (must be AL16UTF16 or UTF8).
Changing Character Sets
- Two tools exist:
- Database Character Set Scanner (csscan)
- csscan system/passwd full=y tochar=utf8
- checks database to see if problems may arise
- one possible problem is that the new char set may require more bytes and therefore the new data won’t fit in the column
- the output will show every row that will have problems with the new char set
- Language and Character Set File Scanner (lcsscan)
- checks text files to see what language the data is in
- must be a straight text file
- Database Character Set Scanner (csscan)
- To change Character Sets of the DB
- SQL> ALTER DATABASE CHARACTER SET …
- SQL> ALTER DATABASE NATIONAL CHARACTER SET …
- The new Character Set must be a superset of the old set but this does not guarantee no corruptions
Globalization Within the Database
- SQL> select * from nls_database_parameters;
Globalization at the Instance Level
- Instance parameter settings will override the database settings.
- NLS_INSTANCE_PARAMETERS show the settings just like NLS_DATABASE_PARAMETERS
Client-Side Environment Settings
- C:\> set nls_lang=Hungarian
- It may be dangerous for clients to set their own NLS settings.
- Conversion may not be possible: Asian -> US7ASCII
- The conversion between client-side and server-side is done by Oracle Net.
Session-Level Globalization Settings
- A user can set their NLS settings in two ways:
- SQL> alter session set nls_date_format=’dd.mm.yyyy’;
- SQL> execute dbms_session.set_nls(‘nls_date_format’,”‘dd.mm.yyyy’”);
Statement Globalization Settings
- The tightest level of control over Globalization is to manage it from within each statement.
- SQL> select to_char(hiredate,’Day dd, Month YYYY’,'nls_date_language=dutch’), to_char(hiredate, ‘Day dd, Month YYYY’,nls_date_language=german) from scott.emp;
- Statement conversion choices:
- to_date – nls_date_language, nls_calendar
- to_number – nls_numeric_characters, nls_currency, nls_dual_currency, nls_iso_currency, nls_calendar
- to_char, to_nchar – nls_date_language, nls_numeric_characters, nls_currency, nls_dual_currency, nls_iso_currency, nls_calendar
Linguistic Sorting and Selection
- Oracle’s default sort order is binary, which converts characters to their ASCII codes and sorts them accordingly. This is suitable for American english but not for other languages.
- Linguistic sorting means that Oracle will replace each character with a numeric value that represents its sorting value for the chosen language.
- All nls_sort options can be suffixed with _AI or _CI for accent insensitive or case insensitive
- SQL> alter session set nls_sort=FRENCH_CI;
The Locale Builder
- A graphical tool to create customized Globalization environments
- $ORACLE_HOME/nls/lbuilder/lbuilder
Using Time Zones
- The Oracle environment can be made time zone aware by using datatypes:
- TIMESTAMP WITH TIME ZONE – stores the timestamp and time zone it refers to
- TIMESTAMP WITH LOCAL TIME ZONE – timestamp is converted to db and converted to user’s local time zone on retrieval
- DATE and TIMESTAMP are always normalized to the database time zone and displayed unchanged when retrieved
Parameters
Language Support
- NLS_LANGUAGE – defaults to AMERICAN
- NLS_DATE_LANGUAGE – defaults to AMERICAN
- NLS_SORT – defaults to BINARY which is a poor choice for many languages
Territory Support
- NLS_TERRITORY – defaults to AMERICA
- NLS_CURRENCY – defaults to $
- NLS_DUAL_CURRENCY – defaults to $ (secondary currency for territory)
- NLS_ISO_CURRENCY – defaults to AMERICA
- NLS_DATE_FORMAT – defaults to DD-MM-RR
- NLS_NUMERIC_CHARACTERS – defaults to ., (decimal and comma)
- NLS_TIMESTAMP_FORMAT – defaults to DD-MM-RRHH.MI.SSXFF AM (for columns of type timestamp)
- NLS_TIMESTAMP_TZ_FORMAT – defaults to DD-MM-RRHH.MI.SSXFF AM TZR (for columns of type TIMESTAMP WITH LOCAL TIMEZONE)
Other NLS Settings
- NLS_CALENDAR – defaults to Gregorian
- NLS_COMP – defaults to BINARY (comparison of letters by binary value)
- NLS_LENGTH_SEMANTICS – defaults to BYTE (column length specified in bytes – not chars)
- NLS_NCHAR_CONV_EXCP – defaults to false (won’t show error messages when conversions between different types fail – VARCHAR2 -> NVARCHAR)
Views & Tables
- V$NLS_VALID_VALUES
- V$NLS_PARAMETERS – Globalization settings currently in effect for your session
- V$TIMEZONE_NAMES -list of supported time zones
- NLS_DATABASE_PARAMETERS – shows current values of all NLS settings at db level
- NLS_INSTANCE_PARAMETERS – shows current values of all NLS settings at instance level
- NLS_SESSION_PARAMETERS – shows current values of all NLS settings at session level