Description of the Kannada Language (2024)

Kannada Unicode Design Guide

Abstract: This document provides general information about the Kannada language and the conventions for its use on computers. It covers the input, storage, display and printing of Kannada characters. We believe that this information, gathered from various standards, is necessary for the correct use of the language in Kannada language computing applications. It also includes the sorting sequence for Kannada in Unicode.

Note 1: This document contains Unicode characters and can be viewed using MS Office XP on Windows XP or an equivalent.

Note 2: The convention followed in Chapter 9 (South and Southeast Asian Scripts) of the Unicode Standard (Version 3.0) is used in this document and might differ from the notation commonly used for the Kannada script.

Contact Information:

Chief Investigator

Resource Centre for Indian Language Technology Solutions - Kannada

Department of Management Studies

Indian Institute of Science

Bangalore – 560 012

Phone : 91-80-3466022 / 394 2377 (Dir)

91-80-394 2378 / 394 2567

Fax : 91-80-346 6022 / 3600683 / 3600085

Email : root@iltwebserver.mgmt.iisc.ernet.in

Table of Contents

1. History of Kannada Language

1.1 Description of Kannada Script

1.2 Brief introduction to Kannada language

1.2.1 Vowels

1.2.2 Anuswaras

1.2.3 Visarga

1.2.4 Avagraha

1.2.5 Consonants

1.2.6 Basic Language Rule in Kannada

2. Technical Characteristics

2.1 Kannada Alphabet Characteristic

2.1.1 Consonant Letters

2.1.2 Independent Vowel Letters

2.1.3 Dependent Vowel Signs

2.1.4 Virama (Halant)

2.1.5 Consonant Conjuncts

2.1.6 Visarga

2.1.7 Avagraha

2.1.8 Numerals

2.1.9 Punctuation Marks

2.1.10 Ancient Signs

2.2 Fonts

2.2.1 Font Developing Tools

2.3 Keyboard

2.4 Presentation and Storage Considerations

2.5 Rendering Rules

2.5.1 Dead Consonant Rule

2.5.2 Consonant RA Rules

2.5.3 Ligature Rules

2.6 Sorting issues in Kannada

2.6.1 Sorting of Nukta characters

2.6.2 Sorting the data records containing anuswara and visarga

2.6.3 Sorting of words with dead consonants

2.6.4 Sorting of Conjuncts having two different display forms

2.6.5 Sorting of Diacritic characters

2.6.6 Conclusion

3. References

Appendix 1: Unicode chart and the Collation chart if deletion and relocation are not allowed

Appendix 2: Unicode chart and the Collation chart if deletion and relocation are allowed

Appendix 3: Output from FontLab displaying all glyphs in the glyph set standardised by KGP

1. History of Kannada Language

Kannada is a South Indian language spoken in the state of Karnataka, India. It belongs to the Dravidian language family, as do Telugu, Tamil and Malayalam, the other major South Indian languages. Kannada and Telugu have almost the same script; the Malayalam and Tamil scripts resemble each other. Kannada has undergone modifications since before the Common Era. Its history can be divided into four periods:

Purva Halegannada (from the beginning till the 10th century)

Halegannada (from the 10th century to the 12th century)

Nadugannada (from the 12th century to the 15th century)

Hosagannada (from the 15th century)

1.1 Description of Kannada Script

The Kannada script is the visual form of the Kannada language. It originated from the southern Brahmi script (lipi) of the Ashoka period and underwent periodic modifications during the reigns of the Sathavahanas, Kadambas, Gangas, Rastrakutas and Hoysalas. Even before the seventh century, the Telugu-Kannada script was used in the inscriptions of the Kadambas of Banavasi and the early Chalukyas of Badami in the west. From the middle of the seventh century, the archaic variety of the Telugu-Kannada script developed into a middle variety. The modern Kannada and Telugu scripts emerged in the thirteenth century. The Kannada script is also used to write the Tulu, Konkani and Kodava languages.

The Kannada script shares a large number of structural features with other Indian language scripts. Its writing system combines the principles of syllabic writing systems and phonemic writing systems (alphabets). The effective unit of writing Kannada is the orthographic syllable, consisting of a consonant and vowel (CV) core and, optionally, one or more preceding consonants, with a canonical structure of ((C)C)CV. The orthographic syllable need not correspond exactly to a phonological syllable, especially when a consonant cluster is involved, but the writing system is built on phonological principles and tends to correspond quite closely to pronunciation. The orthographic syllable is built up of alphabetic pieces, the actual letters of the Kannada script. These consist of distinct character types: consonant letters, independent vowels and the corresponding dependent vowel signs. In a text sequence, these characters are stored in logical phonetic order.
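The ((C)C)CV structure described above can be approximated with a regular expression. The following is a minimal sketch, not a complete segmentation algorithm: the function name `split_syllables` and the chosen code point ranges are assumptions made for illustration, and marks such as Nukta or the length marks are not handled.

```python
import re

# Assumed ranges: consonants U+0C95..U+0CB9, virama U+0CCD,
# dependent vowel signs U+0CBE..U+0CCC, anusvara/visarga U+0C82/U+0C83.
CONS = "\u0C95-\u0CB9"
SYLLABLE = re.compile(
    "(?:[" + CONS + "]\u0CCD)*"        # zero or more dead consonants (C + virama)
    "[" + CONS + "]"                   # the final consonant of the cluster
    "(?:\u0CCD|[\u0CBE-\u0CCC])?"      # optional virama or dependent vowel sign
    "[\u0C82\u0C83]?"                  # optional anusvara or visarga
    "|[\u0C85-\u0C94][\u0C82\u0C83]?"  # an independent vowel
    "|.",                              # any other character on its own
    re.DOTALL,
)

def split_syllables(text):
    """Split text into approximate orthographic syllables."""
    return SYLLABLE.findall(text)

# "Kannada" written in Kannada: KA, NA + virama + NA, DDA
print(split_syllables("\u0C95\u0CA8\u0CCD\u0CA8\u0CA1"))
```

The pattern tries the longest consonant cluster first, so a dead consonant is grouped with the live consonant that follows it rather than split off.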

The Kannada block of the Unicode Standard (U+0C80 to U+0CFF) is based on ISCII-1988 (Indian Standard Code for Information Interchange). The Unicode Standard (Version 3) encodes Kannada characters in the same relative positions as those coded in the ISCII-1988 standard.

1.2 Brief introduction to Kannada language

1.2.1 Vowels (Swaras) Vowels are the letters that exist independently; they are called Swaras. They are-

ಅ ಆ ಇ ಈ ಉ ಊ ಋ ಎ ಏ ಐ ಒ ಓ ಔ

There are two types of Swaras, depending on the time taken to pronounce them: Hrasva Swara and Deergha Swara.

Hrasva Swara

A freely existing independent vowel that can be pronounced in a single matra time (matra kala), also called a matra. They are-

ಅ ಇ ಉ ಋ ಎ ಒ

Deergha Swara A freely existing independent vowel that is pronounced in two matras. They are-

ಆ ಈ ಊ ಏ ಐ ಓ ಔ

1.2.2 Anuswaras ಅಂ

1.2.3 Visarga ಅಃ

1.2.4 Avagraha Also called Plutha, it is used for the third matra in either a consonant or a vowel.

1.2.5 Consonants (Vyanjanas) These depend on vowels to take an independent form. They can be divided into Vargeeya and Avargeeya.

Vargeeya Vyanjanas

ಕ್ ಖ್ ಗ್ ಘ್ ಙ್

ಚ್ ಛ್ ಜ್ ಝ್ ಞ್

ತ್ ಥ್ ದ್ ಧ್ ನ್

ಟ್ ಠ್ ಡ್ ಢ್ ಣ್

ಪ್ ಫ್ ಬ್ ಭ್ ಮ್

Avargeeya Vyanjanas

ಯ್ ರ್ ಲ್ ವ್ ಶ್ ಷ್ ಸ್ ಹ್ ಳ್

1.2.6 Basic Language Rule in Kannada

When a dependent consonant combines with an independent vowel, an Akshara is formed.

Consonant (Vyanjana) + Vowel (matra) ---> Letter (Akshara)

Example: ಕ್ + ಅ ---> ಕ

Based on this rule, we can combine all the consonants (Vyanjanas) with the existing vowels (matras) to form the Kagunitha for the Kannada alphabet.

ಕ ಕಾ ಕಿ ಕೀ ಕು ಕೂ ಕೃ ಕೆ ಕೇ ಕೈ ಕೊ ಕೋ ಕೌ ಕಂ ಕಃ

ಖ ಖಾ ಖಿ ಖೀ ಖು ಖೂ ಖೃ ಖೆ ಖೇ ಖೈ ಖೊ ಖೋ ಖೌ ಖಂ ಖಃ

ಗ ಗಾ ಗಿ ಗೀ ಗು ಗೂ ಗೃ ಗೆ ಗೇ ಗೈ ಗೊ ಗೋ ಗೌ ಗಂ ಗಃ

ಘ ಘಾ ಘಿ ಘೀ ಘು ಘೂ ಘೃ ಘೆ ಘೇ ಘೈ ಘೊ ಘೋ ಘೌ ಘಂ ಘಃ

ಙ ಙಾ ಙಿ ಙೀ ಙು ಙೂ ಙೃ ಙೆ ಙೇ ಙೈ ಙೊ ಙೋ ಙೌ ಙಂ ಙಃ

ಚ ಚಾ ಚಿ ಚೀ ಚು ಚೂ ಚೃ ಚೆ ಚೇ ಚೈ ಚೊ ಚೋ ಚೌ ಚಂ ಚಃ

ಛ ಛಾ ಛಿ ಛೀ ಛು ಛೂ ಛೃ ಛೆ ಛೇ ಛೈ ಛೊ ಛೋ ಛೌ ಛಂ ಛಃ

ಜ ಜಾ ಜಿ ಜೀ ಜು ಜೂ ಜೃ ಜೆ ಜೇ ಜೈ ಜೊ ಜೋ ಜೌ ಜಂ ಜಃ

ಝ ಝಾ ಝಿ ಝೀ ಝು ಝೂ ಝೃ ಝೆ ಝೇ ಝೈ ಝೊ ಝೋ ಝೌ ಝಂ ಝಃ

ಞ ಞಾ ಞಿ ಞೀ ಞು ಞೂ ಞೃ ಞೆ ಞೇ ಞೈ ಞೊ ಞೋ ಞೌ ಞಂ ಞಃ

ತ ತಾ ತಿ ತೀ ತು ತೂ ತೃ ತೆ ತೇ ತೈ ತೊ ತೋ ತೌ ತಂ ತಃ

ಥ ಥಾ ಥಿ ಥೀ ಥು ಥೂ ಥೃ ಥೆ ಥೇ ಥೈ ಥೊ ಥೋ ಥೌ ಥಂ ಥಃ

ದ ದಾ ದಿ ದೀ ದು ದೂ ದೃ ದೆ ದೇ ದೈ ದೊ ದೋ ದೌ ದಂ ದಃ

ಧ ಧಾ ಧಿ ಧೀ ಧು ಧೂ ಧೃ ಧೆ ಧೇ ಧೈ ಧೊ ಧೋ ಧೌ ಧಂ ಧಃ

ನ ನಾ ನಿ ನೀ ನು ನೂ ನೃ ನೆ ನೇ ನೈ ನೊ ನೋ ನೌ ನಂ ನಃ

ಟ ಟಾ ಟಿ ಟೀ ಟು ಟೂ ಟೃ ಟೆ ಟೇ ಟೈ ಟೊ ಟೋ ಟೌ ಟಂ ಟಃ

ಠ ಠಾ ಠಿ ಠೀ ಠು ಠೂ ಠೃ ಠೆ ಠೇ ಠೈ ಠೊ ಠೋ ಠೌ ಠಂ ಠಃ

ಡ ಡಾ ಡಿ ಡೀ ಡು ಡೂ ಡೃ ಡೆ ಡೇ ಡೈ ಡೊ ಡೋ ಡೌ ಡಂ ಡಃ

ಢ ಢಾ ಢಿ ಢೀ ಢು ಢೂ ಢೃ ಢೆ ಢೇ ಢೈ ಢೊ ಢೋ ಢೌ ಢಂ ಢಃ

ಣ ಣಾ ಣಿ ಣೀ ಣು ಣೂ ಣೃ ಣೆ ಣೇ ಣೈ ಣೊ ಣೋ ಣೌ ಣಂ ಣಃ

ಪ ಪಾ ಪಿ ಪೀ ಪು ಪೂ ಪೃ ಪೆ ಪೇ ಪೈ ಪೊ ಪೋ ಪೌ ಪಂ ಪಃ

ಫ ಫಾ ಫಿ ಫೀ ಫು ಫೂ ಫೃ ಫೆ ಫೇ ಫೈ ಫೊ ಫೋ ಫೌ ಫಂ ಫಃ

ಬ ಬಾ ಬಿ ಬೀ ಬು ಬೂ ಬೃ ಬೆ ಬೇ ಬೈ ಬೊ ಬೋ ಬೌ ಬಂ ಬಃ

ಭ ಭಾ ಭಿ ಭೀ ಭು ಭೂ ಭೃ ಭೆ ಭೇ ಭೈ ಭೊ ಭೋ ಭೌ ಭಂ ಭಃ

ಮ ಮಾ ಮಿ ಮೀ ಮು ಮೂ ಮೃ ಮೆ ಮೇ ಮೈ ಮೊ ಮೋ ಮೌ ಮಂ ಮಃ

ಯ ಯಾ ಯಿ ಯೀ ಯು ಯೂ ಯೃ ಯೆ ಯೇ ಯೈ ಯೊ ಯೋ ಯೌ ಯಂ ಯಃ

ರ ರಾ ರಿ ರೀ ರು ರೂ ರೃ ರೆ ರೇ ರೈ ರೊ ರೋ ರೌ ರಂ ರಃ

ಲ ಲಾ ಲಿ ಲೀ ಲು ಲೂ ಲೃ ಲೆ ಲೇ ಲೈ ಲೊ ಲೋ ಲೌ ಲಂ ಲಃ

ವ ವಾ ವಿ ವೀ ವು ವೂ ವೃ ವೆ ವೇ ವೈ ವೊ ವೋ ವೌ ವಂ ವಃ

ಶ ಶಾ ಶಿ ಶೀ ಶು ಶೂ ಶೃ ಶೆ ಶೇ ಶೈ ಶೊ ಶೋ ಶೌ ಶಂ ಶಃ

ಷ ಷಾ ಷಿ ಷೀ ಷು ಷೂ ಷೃ ಷೆ ಷೇ ಷೈ ಷೊ ಷೋ ಷೌ ಷಂ ಷಃ

ಸ ಸಾ ಸಿ ಸೀ ಸು ಸೂ ಸೃ ಸೆ ಸೇ ಸೈ ಸೊ ಸೋ ಸೌ ಸಂ ಸಃ

ಹ ಹಾ ಹಿ ಹೀ ಹು ಹೂ ಹೃ ಹೆ ಹೇ ಹೈ ಹೊ ಹೋ ಹೌ ಹಂ ಹಃ

ಳ ಳಾ ಳಿ ಳೀ ಳು ಳೂ ಳೃ ಳೆ ಳೇ ಳೈ ಳೊ ಳೋ ಳೌ ಳಂ ಳಃ
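The Kagunitha rows above follow mechanically from the Consonant + Vowel rule, so they can be generated rather than typed. The sketch below assumes Unicode storage, where the bare consonant already carries the inherent vowel and every other form is the consonant followed by one sign; the names `SIGNS` and `kagunita` are illustrative only.

```python
# Dependent vowel signs in Kagunitha order, followed by anusvara and visarga.
# The empty string stands for the inherent vowel (no sign is written).
SIGNS = [
    "",        # a (inherent)
    "\u0CBE",  # aa
    "\u0CBF",  # i
    "\u0CC0",  # ii
    "\u0CC1",  # u
    "\u0CC2",  # uu
    "\u0CC3",  # vocalic r
    "\u0CC6",  # e
    "\u0CC7",  # ee
    "\u0CC8",  # ai
    "\u0CCA",  # o
    "\u0CCB",  # oo
    "\u0CCC",  # au
    "\u0C82",  # anusvara
    "\u0C83",  # visarga
]

def kagunita(consonant):
    """Return the 15 Kagunitha forms of one consonant letter."""
    return [consonant + sign for sign in SIGNS]

# One row of the table, for the consonant KA (U+0C95).
print(" ".join(kagunita("\u0C95")))
```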

2. Technical Characteristics

Note: The convention followed from this section of the document onward is the same as that of Unicode Chapter 9 (South and Southeast Asian Scripts) and might not be grammatically correct.

2.1 Kannada Alphabet Characteristic

2.1.1 Consonant Letters

Each consonant letter represents a single consonantal sound but also has the peculiarity of carrying an inherent vowel, generally the short vowel ಅ (U+0C85). Thus, U+0C95 KANNADA LETTER KA represents not just K (ಕ್) but KA (ಕ). In the presence of a dependent vowel, however, the inherent vowel associated with a consonant letter is overridden by the dependent vowel. The consonants in Kannada are:

ಕ ಖ ಗ ಘ ಙ

ಚ ಛ ಜ ಝ ಞ

ತ ಥ ದ ಧ ನ

ಟ ಠ ಡ ಢ ಣ

ಪ ಫ ಬ ಭ ಮ

ಯ ರ ಲ ವ

ಶ ಷ ಸ ಹ ಳ

2.1.3 Dependent Vowel Signs (Matras)

The dependent vowels, also known as Swaras in Kannada, serve as the common manner of writing non-inherent vowels and are generally referred to as Swara Chinhas in Kannada or Matras in Sanskrit. The dependent vowels do not appear stand-alone; rather, they are visibly depicted in combination with a base letterform (generally a consonant). A single consonant or a consonant cluster may have a dependent vowel applied to it to indicate the vowel quality of the syllable when it is different from the inherent vowel. Explicit appearance of a dependent vowel in a syllable overrides the inherent vowel ಅ (U+0C85) of a single consonant letter.

There are several ways in which the dependent vowels are applied to base letterforms. Most of them appear as non-spacing dependent vowel signs above and to the right side of a consonant letter or a consonant cluster. The following are the exceptions to and variations on this rule:

· The two dependent vowel signs (U+0CC3 & U+0CC4) appear one level below and to the right of the consonant or the consonant cluster, separated by a small white space.

· Each of the five dependent vowels (U+0CC0, U+0CC7, U+0CC8, U+0CCA & U+0CCB) is depicted by two or three glyph components (two-part or three-part vowel signs), with one component appearing with a space to the right of the consonant or the consonant cluster.

i) In the case of three of the above-mentioned two/three-part dependent vowels (U+0CC0, U+0CC7, U+0CCB), the non-spacing component(s) of each of them is (are) the same as the vowel sign(s) of the corresponding preceding short vowel. The spacing component for each of these dependent vowels is the same length mark, U+0CD5, given in Unicode version 3. The logic for this is that these dependent vowels are nothing but the long forms (independent and phonetically distinct) of the preceding short vowels.

ii) The first component of the dependent vowel (U+0CC8) mentioned above is the same as the dependent vowel (ೆ, U+0CC6), with the second component (U+0CD6) defined independently in Unicode version 3. The second part appears slightly below and to the right of the consonant or the consonant cluster.

· In view of this, it is important to note that the two glyphs (the length mark and the second component of ೈ) represented by the codes U+0CD5 and U+0CD6 in Unicode version 3 have no independent existence and do not play any part as independent codes in the collation algorithm.

· Unlike Devanagari, the Kannada script does not have any character with a left-side dependent vowel sign.

· A one-to-one correspondence exists between independent vowels and dependent vowel signs.

The Matras are-

ಾ ಿ ೀ ು ೂ ೃ ೆ ೇ ೈ ೊ ೋ ೌ ಂ ಃ
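The component structure of the two-part and three-part vowel signs described in this section can be inspected through Unicode canonical decomposition. The sketch below uses Python's standard `unicodedata` module; the helper name `components` is an assumption made for illustration.

```python
import unicodedata

def components(sign):
    """Return the code points of a vowel sign after canonical decomposition (NFD)."""
    return ["U+%04X" % ord(ch) for ch in unicodedata.normalize("NFD", sign)]

# The five multi-part dependent vowel signs named in the text.
for sign in ("\u0CC0", "\u0CC7", "\u0CC8", "\u0CCA", "\u0CCB"):
    print("U+%04X" % ord(sign), "=", " + ".join(components(sign)))
```

Under this decomposition U+0CC0, U+0CC7 and U+0CCB end in the length mark U+0CD5, and U+0CC8 ends in U+0CD6, which matches the description in items i) and ii).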

2.1.4 Virama (Halant)

Like Devanagari, the Kannada script employs a sign known as Halant, or vowel omission sign. A halant sign (್, U+0CCD) nominally serves to cancel (or kill) the inherent vowel of the consonant to which it is applied. It functions as a combining character. When a consonant has lost its inherent vowel through the application of halant, it is known as a dead consonant. The dead consonants are the presentation forms used to depict consonants without an inherent vowel. Their rendered forms in Kannada resemble the full consonant with the vertical stem replaced by the halant sign, which marks a character core. The stem glyph (U+0CBB) is graphically and historically related to the sign denoting the inherent /a/ vowel ಅ (U+0C85). In contrast, a live consonant is a consonant that retains its inherent vowel or is written with an explicit dependent vowel sign. The dead consonant is defined as a sequence consisting of a consonant letter followed by a halant. The default rendering for a dead consonant is to position the halant as a combining mark bound to the consonant letterform. The Halant in Kannada is ್

2.1.5 Consonant Conjuncts

Like any other Indian script, Kannada is noted for a large number of consonant conjunct forms that serve as orthographic abbreviations (ligatures) of two or more adjacent forms. This abbreviation takes place only in the context of a consonant cluster. An orthographic consonant cluster is defined as a sequence of characters that represents one or more dead consonants (denoted by C_d) followed by a normal live consonant (denoted by C_l).

Corresponding to each Kannada consonant, there exists a separate and unique glyph that is used specifically to represent the consonant in a consonant cluster. Most of these conjunct consonant glyphs resemble their original consonant forms (many without the implicit vowel sign, wherever applicable).

In Kannada, there is only one type of conjunct formation (consonant cluster), and it is depicted as follows:

· The first consonant of the consonant cluster is rendered with the implicit or a different dependent vowel appearing as the terminal element of the consonant cluster.

· The remaining consonants (consonants in between the first consonant and the terminal vowel element) appear in conjunct consonant glyph forms in phonetic order. They are generally depicted directly below, or sometimes below but to the right of, the first consonant.

Thus, a systematically designed Kannada script font contains the conjunct glyph components, but they are not encoded as Unicode characters, because they are the result of the ligation of distinct letters. Kannada script rendering software must be able to map appropriate combinations of characters in context to the appropriate conjunct glyphs in fonts.

2.1.6 Visarga

The Visarga comes after a vowel sound and represents a sound similar to /h/.

2.1.7 Avagraha

The Avagraha sign is a spacing mark used when rendering Sanskrit text. It is located at U+0CBD.

2.1.8 Numerals

Kannada numerals are located from U+0CE6 to U+0CEF.
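Because the ten Kannada digits occupy consecutive code points starting at U+0CE6, conversion to and from ordinary integers is simple arithmetic. The following is a minimal sketch for non-negative integers; the function names are assumptions made for illustration.

```python
KANNADA_ZERO = 0x0CE6  # KANNADA DIGIT ZERO; digits run to U+0CEF

def to_kannada_digits(number):
    """Render a non-negative integer using Kannada digits."""
    return "".join(chr(KANNADA_ZERO + int(d)) for d in str(number))

def from_kannada_digits(text):
    """Parse a string of Kannada digits; Python's int() accepts Unicode decimal digits."""
    return int(text)

print(to_kannada_digits(2024))
print(from_kannada_digits("\u0CE7\u0CE8\u0CE9"))
```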

2.1.9 Punctuation Marks

All punctuation marks in Kannada are borrowed from English.

These characters are not included in the Kannada range of the Unicode character set.

2.1.10 Ancient Signs

Some of the Halegannada characters are placed at U+0C8C, U+0CB1 and U+0CE1.

2.2 Fonts

There are a number of TrueType fonts available for Kannada. Some of them follow an encoding standard (like ISCII), while others follow no encoding standard and are tied to a proprietary encoding. The Kannada Ganaka Parishat (KGP) has standardised the glyph set to be used by all software that supports Kannada. Appendix 3 displays the glyphs standardised by KGP. Microsoft has released an OpenType font (with TrueType outlines) for Kannada, "Tunga.ttf", that follows Unicode as its encoding standard.

2.2.1 Font Developing Tools

The OpenType font format is an extension of the TrueType font format, adding support for PostScript font data. The following tools can be used to design OpenType fonts.

FontLab version 4.0 (A trial version for Windows is available for download at http://www.fontlab.com/html/fontlab.html#downloads)
Font Creator Program version 3.0 (A trial version is available for download at http://www.high-logic.com/download.htm)
Fontographer version 4.1 (No trial version is available. For more information - http://www.macromedia.com/software/fontographer/)
Microsoft Visual OpenType Layout Tool "VOLT" provides an easy-to-use graphical user interface to add OpenType layout tables to fonts with TrueType outlines. It is licensed free and can be downloaded from the online community http://communities.msn.com/MicrosoftVOLTuserscommunity

2.3 Keyboard

Inputting Kannada or any other Indian language needs a keyboard driver / Input Method, a software component that interprets user operations such as typing keys. Many keyboard drivers/Input Methods are available in the market for Windows operating systems, such as Baraha, Sreelipi, Akruthi and Kalitha. They follow different encoding methods (glyph codes) and support different keyboard layouts such as INSCRIPT, English Phonetic, Typewriter 1 and Typewriter 2.

Microsoft supports Input Methods for nine Indian languages (including Kannada) in Office XP on Windows XP with the INSCRIPT keyboard layout, which is common to all Indian languages and uses Unicode as the encoding standard.

The Government of Karnataka (Kannada Ganaka Parishat) has proposed a standard keyboard layout for Kannada. In this layout, only the 26 keys that are marked with English letters on a keyboard are used to represent the 51 basic letters and special symbols of Kannada (13 swaras, 34 consonants and 4 special symbols). This is possible because each key has a dual function, representing the lower-case (normal key) and upper-case (Shift key) letters in English, as shown in Figure 2.1.


Fig. 2.1: Keyboard layout proposed by Kannada Ganaka Parishat.

Since 51 key positions have been used while the keyboard provides 52 possibilities, the remaining one (key X is not assigned) can be used to represent foreign sounds such as the combination of Nukta and ಫ. Besides, there is a need to represent the old ಱ. The combination of X with the consonant ರ yields ಱ.

2.4 Presentation and Storage Considerations

The order for storage of plain text in Kannada generally follows the phonetic order; that is, a CV syllable with a dependent vowel is always encoded as a consonant letter C followed by a vowel sign V in the memory representation. This order is employed by the ISCII standard and corresponds to the phonetic and keying order of textual data. Unlike Devanagari and some other Indian scripts, all the dependent vowels in Kannada are depicted to the right of their consonant letters. Hence there is no need to reorder the elements when mapping from the logical (character) store to the presentation (glyph) rendering and vice versa.

Character order	Glyph order

KA + I	KA + I

ಕ + ಿ	ಕಿ

Further, Kannada script does not allow half-consonants, ligatures and half ligature forms.
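The storage order described in this section can be checked by listing the code points of a stored syllable. The sketch below uses Python's standard `unicodedata` module; the helper name `describe` is an assumption made for illustration.

```python
import unicodedata

def describe(text):
    """List each stored character as (code point, Unicode character name)."""
    return [("U+%04X" % ord(ch), unicodedata.name(ch)) for ch in text]

# The syllable KI is stored as the consonant KA followed by the vowel sign I.
for code, name in describe("\u0C95\u0CBF"):
    print(code, name)
```

The consonant comes first in memory and the vowel sign second, which is also the left-to-right order in which the glyphs are drawn.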

2.5 Rendering Rules

(Based on the Microsoft Uniscribe-OpenType implementation of the Unicode rendering rules)

Notation. In the next set of rules, the following notation applies:

C_n Nominal glyph form of a consonant C as it appears in the code charts.

C_l A live consonant, depicted identically to C_n.

C_d Glyph depicting the dead consonant form of a consonant C.

C_h Glyph depicting the half-consonant form of a consonant C.

L_n Nominal glyph form of a conjunct ligature consisting of two or more component consonants. A conjunct ligature composed of two consonants X and Y is also denoted by X.Y_n.

RA_sub A non-spacing combining mark glyph form positioned below the base glyph form.

V_vs Glyph depicting the dependent vowel sign form of a vowel V.

VIRAMA_n The nominal glyph form of the non-spacing combining mark depicting U+0CCD KANNADA SIGN VIRAMA.

A virama character is not always depicted; when it is depicted, it adopts this non-spacing mark form.

2.5.1 Dead Consonant Rule. The following rule logically precedes the application of any other rule to form a dead consonant. Once formed, a dead consonant may be subject to the other rules described next.

R0: When a consonant C_n precedes a VIRAMA_n, it is considered to be a dead consonant C_d. A consonant C_n that does not precede VIRAMA_n is considered to be a live consonant C_l.

TA_n + VIRAMA_n ---> TA_d

ತ + ್ ---> ತ್
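Rule R0 amounts to a one-character lookahead over the stored text. The following is a minimal sketch of that classification, assuming consonants lie in U+0C95..U+0CB9; the function names are illustrative and do not come from any rendering engine.

```python
VIRAMA = "\u0CCD"

def is_consonant(ch):
    """True for Kannada consonant letters (assumed range U+0C95..U+0CB9)."""
    return 0x0C95 <= ord(ch) <= 0x0CB9

def classify_consonants(text):
    """Label each consonant 'dead' if a virama follows it, otherwise 'live' (rule R0)."""
    result = []
    for i, ch in enumerate(text):
        if is_consonant(ch):
            followed_by_virama = text[i + 1:i + 2] == VIRAMA
            result.append((ch, "dead" if followed_by_virama else "live"))
    return result

# TA + VIRAMA gives one dead consonant.
print(classify_consonants("\u0CA4\u0CCD"))
```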


2.5.2 Consonant RA Rules:

R1: If the dead consonant RA_d precedes either a consonant or an independent vowel, then it is replaced by the postscript ARKAVATTU, which is positioned so that it applies to the logically subsequent element in the memory representation.

RA_d + KA_l ---> KA_l + ARKAVATTU ---> Displayed Output

ರ್ + ಕ ---> ಕ + ARKAVATTU ---> ರ್ಕ

R2: Except for the dead consonant RA_d, when a dead consonant C_d precedes the live consonant RA_l, then C_d is replaced with its nominal form C_n, and RA is replaced by the subscript non-spacing mark RA_sub, which is positioned so that it applies to C_n.

TTHA_d + RA_l ---> TTHA_n + RA_sub ---> Displayed Output

ಠ್ + ರ ---> ಠ + RA_sub ---> ಠ್ರ

R3: If a dead consonant (other than RA_d) precedes RA_d, then the substitution of RA for RA_sub is performed as described above; however, the VIRAMA that formed RA_d remains so as to form a dead consonant conjunct form.

TA_d + RA_d ---> TA_n + RA_sub + VIRAMA_n ---> T.RA_d

ತ್ + ರ್ ---> ತ + RA_sub + ್ ---> ತ್ರ್

A dead consonant conjunct form that contains an absorbed RA_d may subsequently combine to form a multipart conjunct form.

T.RA_d + YA_l ---> T.R.YA_n

ತ್ರ್ + ಯ ---> ತ್ರ್ಯ
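The choice between rules R1, R2 and R3 depends only on where RA sits in the stored sequence. The sketch below classifies a short sequence of the form consonant + virama + following character; it is an illustration of the rule conditions only, not of glyph substitution, and the function name `ra_rule` is hypothetical.

```python
RA = "\u0CB0"
VIRAMA = "\u0CCD"

def is_consonant(ch):
    return 0x0C95 <= ord(ch) <= 0x0CB9

def is_independent_vowel(ch):
    return 0x0C85 <= ord(ch) <= 0x0C94

def ra_rule(seq):
    """Return 'R1', 'R2', 'R3' or None for a sequence starting with a dead consonant."""
    if len(seq) < 3 or not is_consonant(seq[0]) or seq[1] != VIRAMA:
        return None  # the sequence does not start with a dead consonant
    first, second = seq[0], seq[2]
    if first == RA:
        # R1: dead RA before a consonant or an independent vowel
        if is_consonant(second) or is_independent_vowel(second):
            return "R1"
        return None
    if second == RA:
        # R3 if the RA is itself dead (followed by a virama), otherwise R2
        return "R3" if seq[3:4] == VIRAMA else "R2"
    return None

print(ra_rule("\u0CB0\u0CCD\u0C95"))  # RA_d + KA_l
print(ra_rule("\u0CA0\u0CCD\u0CB0"))  # TTHA_d + RA_l
```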

2.5.3 Ligature Rules: Subsequent to the application of the rules just described, a set of rules governing ligature formation applies. The precise application of these rules depends on the availability of glyphs in the current font(s) being used to display text.

R4: If a dead consonant immediately precedes another dead consonant or a live consonant, then the first dead consonant may join the subsequent element to form a two-part conjunct ligature form.

JA_d + NYA_l ---> J.NYA_n

ಜ್ + ಞ ---> ಜ್ಞ

R5: A conjunct ligature form can itself behave as a dead consonant and enter into further, more complex ligatures.

SA_d + TA_d + KA_l ---> S.T.KA_n

ಸ್ + ತ್ + ಕ ---> ಸ್ತ್ಕ

R6: If a nominal consonant or conjunct ligature form precedes RA_sub as a result of the application of rule R2, then the consonant or ligature form may join with RA_sub to form a multipart conjunct ligature (see rule R2 for more information).

KA_n + RA_sub ---> K.RA_n

ಕ + RA_sub ---> ಕ್ರ

R7: In some cases, other combining marks will also combine with a base consonant, either attaching at a nonstandard location or changing shape. In minimal rendering there are only two cases, RA_l with U_vs or UU_vs.

RA_l + U_vs ---> RU_n	RA_l + UU_vs ---> RUU_n

ರ + ು ---> ರು	ರ + ೂ ---> ರೂ

R8: When the dependent vowel I_vs is used to override the inherent vowel of a syllable, it is attached to the base (first) consonant of the orthographic syllable. If the orthographic syllable contains a consonant cluster, the vowel sign is still depicted on the base consonant, with the conjunct forms below or to the right of it; unlike Devanagari, it is not moved to the left of the cluster.

TA_d + RA_l + I_vs ---> T.RI_n

ತ್ + ರ + ಿ ---> ತ್ರಿ

2.6 Sorting issues in Kannada

The sorting sequence for Kannada Unicode is as per the collation chart given in the appendices. However, the following are some important issues that have to be addressed separately for proper sorting of data in Kannada.

ISCII-91 provides direct sorting through its codes. It is the natural sorting method, based purely on code values. There are no special algorithms for language-specific sorting issues. This results in non-conventional sorting in some specific cases. Kannada scholars have specified the sorting standards for Kannada, and these standards are followed in all dictionaries and other documents in Kannada. With this in view, the following four special cases have been identified.

2.6.1 Sorting of Nukta characters

The modifying mark or Nukta, located at U+0CBC and included in the collation table, is enough to take care of the sorting issues of the characters ಜ + (U+0CBC) (modified ಜ) and ಫ + (U+0CBC) (modified ಫ). It also takes care of any other consonant that may be modified using Nukta.

2.6.2 Sorting the data records containing anuswara and visarga

When a data set containing words that end in anuswara or visarga is sorted together with other words, the words without terminating dependent vowels are placed in the wrong positions.

The sorting sequence as per Unicode conforms to the specified standards if the anuswara and visarga appear within a word.

2.6.3 Sorting of words with dead consonants

Sorting of words terminating with dead consonants

Sorting in this case also violates the sorting rules of Kannada. Unicode sorting places a word terminating with a dead consonant after the other words that begin with the same letters. The following list compares the sorting of sample data using the Unicode table with the acceptable sorting for this case.

Sorted data as per Unicode	Acceptable sorting
ರಾಕ	ರಾಕ್
ರಾಕ್	ರಾಕ
ರಾಗ	ರಾಗ್
ರಾಗೋ	ರಾಗ
ರಾಗ್	ರಾಗೋ
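The two columns of the table can be reproduced with a small sort key. The sketch below is one possible approach, not the method prescribed by the standards cited here: it gives a word-final dead consonant a lower weight than the same consonant with its inherent vowel, and leaves every other comparison in code point order. The function name `sort_key` and the weights 0 and 1 are assumptions made for illustration.

```python
VIRAMA = 0x0CCD

def is_consonant(cp):
    return 0x0C95 <= cp <= 0x0CB9

def is_vowel_sign(cp):
    return 0x0CBE <= cp <= 0x0CCC

def sort_key(word):
    """Build a key in which a word-final dead consonant sorts before the live form."""
    cps = [ord(ch) for ch in word]
    key = []
    i = 0
    while i < len(cps):
        cp = cps[i]
        nxt = cps[i + 1] if i + 1 < len(cps) else None
        if is_consonant(cp):
            if nxt == VIRAMA:
                final = i + 2 == len(cps)
                # Word-final dead consonant gets weight 0; in-word ones keep code order.
                key.append((cp, 0 if final else VIRAMA))
                i += 2
            elif nxt is not None and is_vowel_sign(nxt):
                key.append((cp, nxt))  # consonant with an explicit vowel sign
                i += 2
            else:
                key.append((cp, 1))    # consonant with its inherent vowel
                i += 1
        else:
            key.append((cp, 0))
            i += 1
    return key

words = ["ರಾಕ", "ರಾಕ್", "ರಾಗ", "ರಾಗೋ", "ರಾಗ್"]
print(sorted(words))                # plain code point order (left column)
print(sorted(words, key=sort_key))  # dead consonants first (right column)
```

Dead consonants inside a word are deliberately left in code point order here, since that case is discussed separately in the text.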

Dead consonants within words

Proper sorting of data with such words can be achieved by using the invisible zero width consonant just after the dead consonant.

To circumvent the unacceptable situations mentioned in 2.6.2 and 2.6.3 above, the Unicode Standard character U+200C (Zero Width Non-Joiner) can be used appropriately in the preprocessor and collation algorithms.

2.6.4 Sorting of Conjuncts having two different display forms

Two such conjuncts are rendered in Kannada at present.

· Conjuncts with ರ (U+0CB0) as the first consonant

This has been explained in an earlier section as the Consonant RA rules (section 2.5.2).

Words containing both display forms of the same consonant cluster with ರ (U+0CB0) as the first consonant of the cluster have to be sorted as follows. Even though the display renderings are different, both are identical in all other respects. It is therefore natural that they should appear at consecutive positions. Even though a separate glyph and a corresponding glyph code are present in the display/storage codes, such an arrangement in Unicode will not result in proper sorting.

The only alternative is to represent both display forms by the same set of codes, with a distinguishing code (U+0CF5) within the string for the second display form. In Unicode form, the distinguishing code value within the string of the consonant sorting. This can be achieved through preprocessing software, with specific functions to generate proper glyph codes, storage codes and Unicode at different levels. Such a situation-specific code representation guarantees proper sorting of data containing consonant clusters with two different display forms by ignoring the code U+0CF5. This condition has to be incorporated at the appropriate place in the sorting algorithm.

· The second case of rendering the same character in two different display forms is the dead consonant ನ್. It is also written in a second form as U+0CF5. The sorting issue in this case is dealt with in the same way as in the previous case.

The Zero Width Non-Joiner at U+200C cannot be used instead of U+0CF5, as the same sequence of characters appears both with the Zero Width Non-Joiner and with U+0CF5, the two sequences representing two different syllables (conjuncts).
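Ignoring the distinguishing code during comparison can be sketched as a sort key that strips it before comparing. This is an illustration under stated assumptions: U+0CF5 is the code proposed in this document, its position directly after the virama is assumed for the example, and the names `DISTINGUISHING_CODE` and `sort_key` are hypothetical.

```python
DISTINGUISHING_CODE = "\u0CF5"  # the code proposed in this document for the second display form

def sort_key(word):
    """Compare words with the distinguishing code removed; use the raw string as a tie-break."""
    stripped = word.replace(DISTINGUISHING_CODE, "")
    return (stripped, word)

# RA + virama + KA in its first display form, the same cluster marked as the
# second display form (assumed marker position), and an unrelated cluster.
first_form = "\u0CB0\u0CCD\u0C95"
second_form = "\u0CB0\u0CCD\u0CF5\u0C95"
other = "\u0CB0\u0CCD\u0C97"

print(sorted([second_form, other, first_form]) == [first_form, other, second_form])
print(sorted([second_form, other, first_form], key=sort_key) == [first_form, second_form, other])
```

With the key applied, the two display forms of the same cluster land at consecutive positions, which is the behaviour the text asks for.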

2.6.5 Sorting of Diacritic characters

Diacritic characters formed using the symbols located at 0CD1, 0CD2, 0CD3, 0CD4 and 0CF9 to render accents on consonants are considered equivalent to the corresponding consonants for sorting purposes; hence the above procedure can be adopted in such cases also.

2.6.6 Conclusion

The sorting issues mentioned above may have multiple solutions. Similar issues might have been solved by different methods for other Indian languages. Hence, it is desirable to evolve uniform procedures for issues common to all Indian languages. However, solutions for the sorting problems mentioned here with respect to Kannada have been obtained by considering all consonants from U+0C95 to U+0CB9 and the consonant U+0CDE, when they appear independently in a data field, as pure consonants (i.e. as two-part coded [Ex: 0C95 = (0C95, 0CBB)]). The sorting of a data field is achieved by the indexing method. All these can be elaborated to give the actual algorithms and flow charts, if need be.

References:

1. “South and South East Asian Scripts”, Chapter 9 of The Unicode Standard (Version 3.0), http://www.unicode.org/

2. “Creating and Supporting Open Type Fonts for Indic Scripts”, http://www.microsoft.com/typography

3. “Unicode for Kannada Script” (written by Dr. CV Srinatha Sastry), Directorate of Information Technology, Government of Karnataka

4. “User-Friendly Keyboard Layout for Kannada”, Kannada Ganaka Parishat

5. “Standards for Kannada in Computers prescribed by the Government of Karnataka”, Kannada Ganaka Parishat.

Appendix – 1: Unicode Chart and Collation if the suggested deletion and relocation of characters are not allowed

The figure below shows the Unicode chart for Kannada if deletion and relocation of characters are not allowed.

	0C8	0C9	0CA	0CB	0CC	0CD	0CE	0CF
0	▓▓	ಐ	ಠ	ರ	ೀ	▓▓	ೠ	▓▓
1	▓▓	▓▓	ಡ	ಱ	ು	(image)	ೡ	▓▓
2	ಂ	ಒ	ಢ	ಲ	ೂ	(image)	▓▓	▓▓
3	ಃ	ಓ	ಣ	ಳ	ೃ	(image)	▓▓	▓▓
4	▓▓	ಔ	ತ	▓▓	ೄ	▓▓	▓▓	▓▓
5	ಅ	ಕ	ಥ	ವ	▓▓	ೕ	▓▓	(image)
6	ಆ	ಖ	ದ	ಶ	ೆ	ೖ	೦	▓▓
7	ಇ	ಗ	ಧ	ಷ	ೇ	▓▓	೧	▓▓
8	ಈ	ಘ	ನ	ಸ	ೈ	▓▓	೨	▓▓
9	ಉ	ಙ	▓▓	ಹ	▓▓	▓▓	೩	(image)
A	ಊ	ಚ	ಪ		ೊ	▓▓	೪	▓▓
B	ಋ	ಛ	ಫ	(image)	ೋ	▓▓	೫	▓▓
C	(image)	ಜ	ಬ	(image)	ೌ	▓▓	೬	▓▓
D	▓▓	ಝ	ಭ	(image)	್	▓▓	೭	▓▓
E	ಎ	ಞ	ಮ	ಾ	▓▓	(image)	೮	▓▓
F	ಏ	ಟ	ಯ	ಿ	▓▓	▓▓	೯	▓▓

The figure below shows the collating sequence of Kannada Unicode characters if additions and relocations are not allowed. The sequence is column-wise, top to bottom.

Column 1	Column 2	Column 3	Column 4	Column 5
0C82 ಂ	0CCD ್	0C96 ಖ	0CA6 ದ	0CB9 ಹ
0C83 ಃ	0CBB (image)	0C97 ಗ	0CA7 ಧ	0CB3 ಳ
0C85 ಅ	0CBE ಾ	0C98 ಘ	0CA8 ನ	0CB4 (image)
0C86 ಆ	0CBF ಿ	0C99 ಙ	0CAA ಪ	0CBC (image)
0C87 ಇ	0CC0 ೀ	0C9A ಚ	0CAB ಫ
0C88 ಈ	0CC1 ು	0C9B ಛ	0CAC ಬ
0C89 ಉ	0CC2 ೂ	0C9C ಜ	0CAD ಭ
0C8A ಊ	0CC3 ೃ	0C9D ಝ	0CAE ಮ
0C8B ಋ	0CC4 ೄ	0C9E ಞ	0CAF ಯ
0CE0 ೠ	0CC6 ೆ	0C9F ಟ	0CB0 ರ
0C8E ಎ	0CC7 ೇ	0CA0 ಠ	0CB1 ಱ
0C8F ಏ	0CC8 ೈ	0CA1 ಡ	0CB2 ಲ
0C90 ಐ	0CCA ೊ	0CA2 ಢ	0CB5 ವ
0C92 ಒ	0CCB ೋ	0CA3 ಣ	0CB6 ಶ
0C93 ಓ	0CCC ೌ	0CA4 ತ	0CB7 ಷ
0C94 ಔ	0C95 ಕ	0CA5 ಥ	0CB8 ಸ
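The collating sequence in the chart above can be turned into a rank table and used as a sort key. The sketch below transcribes the sequence column by column; the names `COLLATION_ORDER` and `collation_key` are assumptions made for illustration, and characters outside the chart are simply ranked after it in code point order.

```python
# Collating sequence transcribed column by column from the chart above.
COLLATION_ORDER = (
    [0x0C82, 0x0C83]
    + list(range(0x0C85, 0x0C8C))                      # U+0C85 .. U+0C8B
    + [0x0CE0, 0x0C8E, 0x0C8F, 0x0C90, 0x0C92, 0x0C93, 0x0C94]
    + [0x0CCD, 0x0CBB]
    + list(range(0x0CBE, 0x0CC5))                      # U+0CBE .. U+0CC4
    + [0x0CC6, 0x0CC7, 0x0CC8, 0x0CCA, 0x0CCB, 0x0CCC]
    + list(range(0x0C95, 0x0CA9))                      # U+0C95 .. U+0CA8
    + list(range(0x0CAA, 0x0CB3))                      # U+0CAA .. U+0CB2
    + [0x0CB5, 0x0CB6, 0x0CB7, 0x0CB8, 0x0CB9]
    + [0x0CB3, 0x0CB4, 0x0CBC]
)

RANK = {cp: position for position, cp in enumerate(COLLATION_ORDER)}

def collation_key(word):
    """Map each character to its chart position; unknown characters sort last."""
    return [RANK.get(ord(ch), len(RANK) + ord(ch)) for ch in word]

letters = ["\u0CB3", "\u0CB9", "\u0CB5"]      # LLA, HA, VA
print(sorted(letters))                         # code point order
print(sorted(letters, key=collation_key))      # chart order: VA, HA, LLA
```

The chart places U+0CE0 directly after U+0C8B and U+0CB3 after U+0CB9, so these are the letters whose position changes relative to plain code point order.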

Appendix – 2: Unicode Chart and Collation if the suggested deletion and relocation of characters are allowed

The figure below shows the Unicode chart for Kannada if deletion and relocation of characters are allowed.

	0C8	0C9	0CA	0CB	0CC	0CD	0CE	0CF
0	▓▓	ಐ	ಠ	ರ	ೀ	▓▓	ೠ	▓▓
1	▓▓	▓▓	ಡ	ಱ	ು	(image)	▓▓	▓▓
2	ಂ	ಒ	ಢ	ಲ	ೂ	(image)	▓▓	▓▓
3	ಃ	ಓ	ಣ	ಳ	ೃ	(image)	▓▓	▓▓
4	▓▓	ಔ	ತ	▓▓	ೄ	▓▓	▓▓	▓▓
5	ಅ	ಕ	ಥ	ವ	▓▓	▓▓	▓▓	(image)
6	ಆ	ಖ	ದ	ಶ	ೆ	▓▓	೦	▓▓
7	ಇ	ಗ	ಧ	ಷ	ೇ	▓▓	೧	▓▓
8	ಈ	ಘ	ನ	ಸ	ೈ	▓▓	೨	▓▓
9	ಉ	ಙ	▓▓	ಹ	▓▓	▓▓	೩	(image)
A	ಊ	ಚ	ಪ		ೊ	▓▓	೪	▓▓
B	ಋ	ಛ	ಫ	(image)	ೋ	▓▓	೫	▓▓
C	▓▓	ಜ	ಬ	(image)	ೌ	▓▓	೬	▓▓
D	▓▓	ಝ	ಭ	(image)	್	▓▓	೭	▓▓
E	ಎ	ಞ	ಮ	ಾ	▓▓	▓▓	೮	▓▓
F	ಏ	ಟ	ಯ	ಿ	▓▓	▓▓	೯	▓▓

The figure below shows the collating sequence of Kannada Unicode characters if additions and relocations are allowed. The sequence is column-wise, top to bottom.

Column 1	Column 2	Column 3	Column 4	Column 5
0C82 ಂ	0CCD ್	0C96 ಖ	0CA6 ದ	0CB9 ಹ
0C83 ಃ	0CBB (image)	0C97 ಗ	0CA7 ಧ	0CB3 ಳ
0C85 ಅ	0CBE ಾ	0C98 ಘ	0CA8 ನ	0CB4 (image)
0C86 ಆ	0CBF ಿ	0C99 ಙ	0CAA ಪ	0CBC (image)
0C87 ಇ	0CC0 ೀ	0C9A ಚ	0CAB ಫ
0C88 ಈ	0CC1 ು	0C9B ಛ	0CAC ಬ
0C89 ಉ	0CC2 ೂ	0C9C ಜ	0CAD ಭ
0C8A ಊ	0CC3 ೃ	0C9D ಝ	0CAE ಮ
0C8B ಋ	0CC4 ೄ	0C9E ಞ	0CAF ಯ
0CE0 ೠ	0CC6 ೆ	0C9F ಟ	0CB0 ರ
0C8E ಎ	0CC7 ೇ	0CA0 ಠ	0CB1 ಱ
0C8F ಏ	0CC8 ೈ	0CA1 ಡ	0CB2 ಲ
0C90 ಐ	0CCA ೊ	0CA2 ಢ	0CB5 ವ
0C92 ಒ	0CCB ೋ	0CA3 ಣ	0CB6 ಶ
0C93 ಓ	0CCC ೌ	0CA4 ತ	0CB7 ಷ
0C94 ಔ	0C95 ಕ	0CA5 ಥ	0CB8 ಸ

Appendix – 3: Output from FontLab displaying all glyphs in the glyph set standardised by KGP

[Figure: FontLab output showing the glyph set standardised by KGP]
