Assamese Script Misrepresentation(Cmb)
Dated Guwahati the 18th of March 2014
To
Mr D. S. Pegu
The Managing Director
Assam Electronics Development Corporation Ltd (AMTRON)
Bamunimaidam, Guwahati
Assam
Subject : Report on ASSAMESE SCRIPT MISREPRESENTATIONS IN INTERNATIONAL STANDARDS
Sir, I have been working on the issue of the Assamese script misrepresentations in the International Standards since 2011. I recently returned from New Delhi after attending, on their invitation, a meeting of the Bureau of Indian Standards (BIS) on the 5th of February 2014. I received the minutes of the meeting by email a few days ago. A panel has been instituted to look into the entire issue, and the Department of Information Technology, Government of Assam has been nominated as one of the members of the said panel.
It is in this context that I am enclosing a comprehensive report titled ASSAMESE SCRIPT MISREPRESENTATIONS IN INTERNATIONAL STANDARDS.
I hope that this will aid the Government of Assam in solving this long-standing problem.
Thanking you.
Yours sincerely
Dr Satyakam Phukan
General Surgeon
Hemchandra Road
Jorpukhuripar, Uzanbazar
Guwahati, Assam
Phone : 99540 46357
Copy to :
1. Chief Secretary Government of Assam, Dispur, Guwahati
2. Mr Rajiv Kr Bora, IAS, Principal Secretary, Deptt. of IT, Government of Assam
3. Mr Jishnu Barua, IAS, Principal Secretary to Hon'ble Chief Minister, Assam
4. Mr Anurag Goel, IAS, Commissioner & Secretary, Deptt. of IT, Government of Assam
ASSAMESE SCRIPT MISREPRESENTATIONS IN
INTERNATIONAL STANDARDS
The International Alphabet of Sanskrit Transliteration (IAST) is a transliteration scheme that allows a lossless romanization of Indic scripts as employed by the Sanskrit language. IAST is based on a standard established by the International Congress of Orientalists at Geneva in 1894. It allows a lossless transliteration of Devanāgarī (and other Indic scripts, such as the Śāradā script).
The Indian Script Code for Information Interchange (ISCII) was first adopted in 1988. ISCII has IAST as the basis of transliteration. An updated ISCII was adopted by the Bureau of Indian Standards in 1991, after the draft finalised by the Computer Media Sectional Committee had been approved by the Electronics and Telecommunication Division Council.
One of the opening paragraphs of the ISCII document states:
There are 15 officially recognized languages in India: Hindi, Marathi, Sanskrit, Punjabi, Gujarati, Oriya, Bengali, Assamese, Telugu, Kannada, Malayalam, Tamil, Urdu, Sindhi and Kashmiri. Out of these, Urdu, Sindhi and Kashmiri are primarily written in Perso-Arabic scripts, but get written in Devanagari too (Sindhi is also written in the Gujarati script). Apart from Perso-Arabic scripts, all the other 10 scripts used for Indian languages have evolved from the ancient Brahmi script and have a common phonetic structure, making a common character set possible. The Northern scripts are Devanagari, Punjabi, Gujarati, Oriya, Bengali and Assamese, while the Southern scripts are Telugu, Kannada, Malayalam and Tamil.
The ISCII code table is a superset of all the characters required in the ten Brahmi-based Indian scripts. For convenience, the alphabet of the official script Devanagari (with diacritic marks for non-Devanagari alphabets) has been used in the standard. For notational simplicity, elsewhere, the term Indian scripts implies Brahmi-based Indian scripts.
ISCII retained most of the transliteration characteristics of IAST. The Assamese script, as represented in the ISCII standard, was therefore not properly represented, since Assamese differs widely from Sanskrit in phonology. IAST is not applicable to the Assamese script.
In 1991 work began on an encoding called the Unicode Standard, prepared by the Unicode Consortium/Inc., whose Indic script encoding is said to be based on ISCII. The Unicode encoding for the Indic scripts, as mentioned in many documents, is supposed to be a superset of ISCII. The Unicode Standard is synchronised with ISO 10646, maintained by the International Organization for Standardization (ISO).
The Assamese alphabet was not separately encoded by Unicode. Following its policy of unification, the Unicode Consortium/Inc. eclipsed the Assamese script into Bengali in the Unicode Standard. The uniqueness of the Assamese script was perhaps unknown to the mainly American experts of the Unicode Consortium/Inc. Unicode compensated for this by including two graphically dissimilar Assamese script characters in the Unicode/ISO 10646 Bengali code chart, converting them into Bengali characters.
Assamese letter "ৰ" (Ra) is described as Bengali letter "র" (Ra) with middle diagonal.
Assamese letter "ৱ" (Waba) is described as Bengali letter "র" (Ra) with lower diagonal.
"ক্ষ" was not represented as a letter but as a ligature, i.e. a conjunct form of two letters:
ক + ষ = ক্ষ, transliterating as Khsya,
whereas the Assamese letter transliterates as:
ক্ষ = Khya.
The fact that many of the Assamese letters, although similar in graphical form to Bengali letters, have an entirely different identity was not given due consideration by the Unicode Standard. The same was repeated in ISO 10646, as this Standard is synchronised with the Unicode Standard.
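The naming described above can be checked directly against the Unicode character database. The following is a minimal Python sketch, assuming only the standard unicodedata module; the constant names and the helper describe are illustrative choices, not part of any standard. It lists the official names of the two letters and shows that Khya is stored as a sequence of three code points rather than as one letter.

```python
import unicodedata

# The two letters placed in the Bengali block for Assamese use.
ASSAMESE_RA = "\u09F0"    # ৰ
ASSAMESE_WABA = "\u09F1"  # ৱ

# Khya as it is stored in text: KA + VIRAMA + SSA.
KHYA = "\u0995\u09CD\u09B7"  # ক্ষ


def describe(text):
    """Return (code point, official character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for label, text in (("Ra", ASSAMESE_RA), ("Waba", ASSAMESE_WABA), ("Khya", KHYA)):
    print(label, describe(text))
```

Every name returned begins with BENGALI, which is the nomenclature this report objects to.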
The Assamese script is, in total, misrepresented or absent in 4 international Standards:
A. ISO 15924
International Standard for Names of the Scripts
B. ISO 10646 = Unicode Standard
Universal Character Set (UCS)
C. ISO 15919
International Standard for Indic Scripts Transliteration
D. ALA-LC Romanization Table
Romanization charts maintained by the US Library of Congress
The present status of the Assamese script in these International Standards is described below in detail:
A. ISO 15924
ISO stands for International Organization for Standardization. ISO is an international representative body formed by a network of national standards bodies. These national standards bodies make up the ISO membership and they represent ISO in their country. In India, the Government of India's Bureau of Indian Standards (BIS) represents India in ISO. ISO prepares Standards for use in diverse fields.
This International Standard provides a code for the presentation of names of scripts. The codes were devised for use in terminology, lexicography, bibliography, and linguistics, but they may be used for any application requiring the expression of scripts in coded form. This International Standard also includes guidance on the use of script codes in some of these applications.
ISO has appointed the Unicode Consortium as the Registration Authority for this International Standard, ISO 15924, i.e. Codes for the representation of names of scripts. Michael Everson of Evertype has been appointed Registrar by the Registration Authority.
Status of Assamese script in ISO 15924: Not included
Copy of ISO 15924 attached hereto as DOCUMENT A.01.
B. ISO 10646 and Unicode Standard
International Standard ISO/IEC 10646, Information technology, defines the Universal Character Set (UCS). The following lines from the ISO 10646 document are quoted below:
This International Standard specifies the Universal Coded Character Set (UCS). It is applicable to the representation, transmission, interchange, processing, storage, input, and presentation of the written form of the languages of the world as well as of additional symbols.
This is the Standard in which the characters of a script recognized in ISO 15924 are encoded. This standard is synchronized with the Unicode Standard maintained by the Unicode Consortium, which is incorporated as a non-profit company, Unicode Incorporated, in the state of California in the United States of America.
It is in these synchronized International Standards that Assamese is included as a subset of Bengali.
Assamese letter "ৰ" (Ra) is described as Bengali letter "র" (Ra) with middle diagonal.
Assamese letter "ৱ" (Waba) is described as Bengali letter "র" (Ra) with lower diagonal.
"ক্ষ" was not represented as a letter but as a ligature, i.e. a conjunct form of two letters:
ক + ষ = ক্ষ, transliterating as Khsya,
whereas the Assamese letter transliterates as:
ক্ষ = Khya.
By a recent change, Unicode has labelled these as Additions for Assamese, thereby adding the term Assamese. Using the Bengali encoding in this Standard, Assamese can be typed on a computer. But apart from that, the Assamese script has no identity in this encoding, and all functions other than typing are distorted, disabled or handicapped. This state of affairs constitutes a grave injustice done to the Assamese script, reflecting on the well-being of the Assamese language and people in general. The current version of the Bengali Code Chart is attached hereto as DOCUMENT B.01.
My friend Pastor Azizul Haque and I made representations to the Unicode Consortium, seeking rectification of this grave injustice done to the Assamese script, by emails dated the 13th and 21st of July 2011. Please see DOCUMENT B.02. The Unicode Consortium responded by writing on the matter to the Department of Information Technology, Government of India. Copy of the document attached hereto as DOCUMENT B.03.
The Government of India also responded and sought the opinion of the respective state Governments of Assam, West Bengal, Bihar and Manipur. The document is attached hereto as DOCUMENT B.04.
On the 9th of January 2012, Pastor Azizul Haque and I sent a memorandum to the Hon'ble Chief Minister of Assam, Mr Tarun Gogoi, on the subject "Non-representation/Erroneous nomenclature of the Assamese script/writing system in the Unicode Character Set (U.C.S) of the Unicode Consortium", with the appeal to take up the matter and take steps to obtain a separate slot/range/place for the Assamese script/writing system in the Universal Character Set (UCS) of the Unicode Consortium. On the 18th of February 2012 the Department of Information Technology, Government of Assam sent an official communication to the Department of Electronics and Information Technology, Government of India, asking that the Unicode Consortium be requested to allot a separate slot/range/block for the Assamese script. Document attached herewith as DOCUMENT B.05.
Following that, on the 13th of June 2012, a meeting was organised by the Department of Electronics and Information Technology, Government of India, in New Delhi on the issue of Assamese and Unicode. Copy of the minutes of the meeting attached hereto as DOCUMENT B.06.
At present the issue has shifted into the realm of the Bureau of Indian Standards (BIS) through ISO, as described below.
C. ISO 15919
ISO 15919 is the transliteration standard for Indic scripts. The following lines are quoted from it:
1 Scope
This International Standard provides tables which enable the transliteration into Latin characters from text in Indic scripts which are largely specified in rows 09 to 0D of UCS (ISO/IEC 10646-1 and Unicode).
The tables provide for the Devanagari, Bengali (including the characters used for writing Assamese), Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Sinhala, Tamil, and Telugu scripts which are used in India, Nepal, Bangladesh and Sri Lanka. The Devanagari, Bengali, Gujarati, Gurmukhi, and Oriya scripts are North Indian scripts, and the Kannada, Malayalam, Tamil, and Telugu scripts are South Indian scripts.
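The phrase "rows 09 to 0D of UCS" in the quoted scope can be made concrete with a short Python sketch. It assumes only the standard unicodedata module, and it relies on the assumption that the first independent vowel of each 128-code-point block in these rows sits at offset 0x05; the function name scripts_in_rows is illustrative. The sketch reads the script name off that character in every half-row.

```python
import unicodedata


def scripts_in_rows(first_row=0x09, last_row=0x0D):
    """List the script named at the start of each 128-code-point block in the given rows."""
    names = []
    for row in range(first_row, last_row + 1):
        for half in (0x00, 0x80):
            # Assumption: offset 0x05 holds the first independent vowel of the block.
            first_vowel = chr((row << 8) + half + 0x05)
            names.append(unicodedata.name(first_vowel).split()[0])
    return names


print(scripts_in_rows())
```

Ten script names come back, Bengali among them, and none of them is Assamese, which matches the wording of the scope quoted above.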
According to ISO 15919, script conversion is often required for documents such as historical and literary texts, geographical texts (including maps and atlases), bibliographies, catalogues, lists and passports (and other identification documents). Text in Devanagari script or other Indic scripts sometimes needs to be shown in Latin script, where users, or equipment that they are using, cannot read or write the text. Copy of the ISO 15919 document attached hereto as DOCUMENT C.01.
I found that a transliteration chart for Assamese was missing from ISO 15919, and that the Bengali transliteration chart provided there cannot be applied to the Assamese script.
I wrote by email to the International Organization for Standardization (ISO) on the 21st of July 2012 asking for help in correcting the transliteration error of the Assamese script in ISO 15919.
On the 2nd of October 2012 I received a reply from ISO informing me that I should take up the matter with the Bureau of Indian Standards (BIS), the Indian Government's representative in ISO.
Accordingly, I approached the BIS on the issue of the Assamese script in ISO 15919 by email dated the 18th of October 2012. I received a reply from Mr N K Pal, Head of the MSD division of the BIS at that time, asking me to submit a proposal for the same. The relevant portion of his communication is quoted below:
We hope that we have interpreted correctly that you want separate tables to be included for transliteration into Latin characters from text in Assamese script instead of clubbing them together with Bengali scripts as has been presently done.
In order to point out the problem in right perspective to ISO, could we request you to kindly provide the exact changes (clause wise) you would like to propose in the existing ISO 15919, a copy of which is hereby enclosed for your ready reference please. It would be appreciated if any documentary evidence in support of your comments be provided to us for facilitating the decision.
You are further informed that the International Standard, ISO 15919:2001, along with your specific comments will be circulated to all members of MSD 5 Sectional Committee for its consideration. Based on the decision of the MSD 5 Sectional Committee, ISO/TC 46 will be formally requested by BIS to suitably amend ISO 15919:2001.
We wish an early resolution to this problem and we assure you that we will keep you posted on the developments.
On the 24th of November 2012 I sent a proposal regarding the Assamese script in ISO 15919. A copy of the proposal is attached herewith as DOCUMENT C.02.
My proposal was discussed in the 14th meeting of the Documentation and Information Sectional Committee, MSD 5 (the National Mirror Committee to ISO/TC 46), held on 14 December 2012 at BIS, New Delhi. The decision was that, in order to include the Assamese script in ISO 15919, the Assamese script first needs to be included in ISO 10646-1. For this, the matter was handed over to the LITD division of the BIS. Relevant portions of the minutes are quoted below:
The Committee considered the information given in item 10.3 of the Agenda regarding suggestions from Dr. Satyam Phukan on ISO 15919:2001 Information and documentation - Transliteration of Devanagri and related Indic scripts into latin characters for corrections in Assamese transliteration required in this ISO Standard by providing separate tables to cover the transliteration of Assamese characters into latin, as because transcription and transliteration of Assamese script was different from Bengali. Detailed proposal received from Dr. Satyam Phukan subsequently was also tabled during the meeting.
The Committee noted that Dr Phukan, in his detailed proposal, has also pointed out that in the International Standard ISO/IEC 10646-1 on Information technology - Universal Multi-Octet Character set (UCS) - Part 1: Architecture and basic multilingual plane (which is a necessary adjunct to ISO 15919:2001), the Assamese script was not recognized as a separate, distinct script from Bengali which needs correction first. He also informed that the Department of Information Technology, Govt. of Assam had already sent a proposal to that effect to the DIT, Govt of India for taking the necessary steps for obtaining a separate range for the Assamese script in Unicode in ISO/IEC 10646-1 standard.
He, therefore, proposed that necessary steps shall first be taken for obtaining a separate range/block in ISO 10646-1 standard and only after that the proposal for providing separate Transliteration tables for Assamese script in ISO 15919 standard will be feasible.
Copy of the email strings containing communication with the ISO and the BIS up to this point of time is attached hereto as DOCUMENT C.03.
Subsequent to this, I was asked by the BIS to provide comments on InScript keyboard layouts for typing the Assamese script on computers and on ISO 10646, and I provided the same to them. I attach copies of the same as DOCUMENT C.04 and DOCUMENT C.05 respectively.
On the 10th of January 2014 I was invited to attend and speak at the Fifth Meeting of LITD 20, to be held on the 5th of February 2014 at the BIS office in New Delhi. The agenda item of the said meeting that required my attendance was as follows:
4. COMMENTS RECEIVED ON ISO 10646-1 IT - UNIVERSAL CODED CHARACTER SET
4.1 Dr. Satyakam Phukan, has sent a proposal for separate Unicode for Assamese language in ISO 10646-1 Information technology - Universal Coded Character Set (UCS). This ISO Standard specifies the universal coded character set and applicable to the representation, transmission, interchange, processing, storage, input, and presentation of the written form of the languages of the world.
In this standard, Assamese language is given under Bengali script with differences mentioned separately as given in enclosed file. Dr. Phukan mentioned that Assamese is a separate script and not a subscript of Bengali script. Thus, separate Universal coded character set is to be provided to Assamese script in ISO/IEC 10646-1 standard by issuing an amendment to the same. The separate universal code as proposed by him is mentioned in enclosed file.
The Committee may decide.
The Agenda document attached herewith as DOCUMENT C.06.
I attended the meeting, taking along with me two other persons: Mr Durlav Gogoi, an Assamese software maker, and Dr Bhaskarjyoti Sarma, Asst. Prof., Deptt. of Assamese, Dibrugarh University. I gave a presentation on the issue of the Assamese Script and the ISO Standards, stressing the need for a separate range/slot/place for the Assamese script in the ISO 10646-1 Standard. The matter has now been handed over to a panel headed by the noted scholar Dr Peribhaskar Rao, which is to examine the issues relating to the Assamese language. Relevant portions from the minutes are quoted below:
4. COMMENTS RECEIVED ON ISO 10646-1 IT - UNIVERSAL CODED CHARACTER SET
4.1 Dr. Satyakam Phukan, gave a presentation on his proposal for a separate place/slot/range for Assamese script in ISO 10646-1 Information technology - Universal Coded Character Set (UCS). He mentioned that Assamese is a separate script and not a subscript of Bengali script. However, the committee thinks that this issue needs to be discussed separately in detail. The committee decided to form a Panel regarding the issues raised by Dr. Phukan. The work of the panel is to examine the issues related to Assamese language as it relates to various ISO standards. The composition of the panel will be as follows subject to their acceptance:
a) Shri Peri Bhaskarrao - Convener
b) Dept. of IT, Assam
c) Shri Manoj Jain, DeiTY
d) Sh Mahesh Kulkarni, CDAC
e) Dr. Dilip Kumar Kalita, Abilac, Assam
f) Secretary of IT, West Bengal
4.2 Based on the recommendations of the panel, further action will be taken in the next meeting of this committee.
Copy of the minutes of the meeting attached hereto as DOCUMENT C.07.
D. ALA-LC Romanization Tables
The American Library Association - Library of Congress (ALA-LC) sets standards for romanization, or the representation of text in other writing systems using the Latin script. This standard is maintained by the Library of Congress of the Government of the United States of America.
This system is used by North American libraries and the British Library. Assamese is one of the languages represented in the said standard. Copy of the home page of the said standard on the Internet is attached hereto as DOCUMENT D.01. The standard also has Bengali as one of the languages. Surprisingly, the Table for Assamese has been presented with exactly the same content as that for Bengali. Copies of the two Tables, Assamese and Bengali, are attached hereto as DOCUMENT D.02 and DOCUMENT D.03 respectively. I communicated with the United States Library of Congress authorities and presented a corrected form of the Assamese Romanization, but without separate markers for the multiple graphemes representing a solitary phoneme. Copy of my corrected document attached hereto as DOCUMENT D.04. The United States Library of Congress authorities rejected my proposal by an email dated the 2nd of November 2012. The relevant portion of their communication is quoted below:
With regard to Mr. Satyakam Phukan's suggestion, we can't use the Romanization table purely based on pronunciation. The reasons are:
1. ALA-LC Romanization tables are developed for use when the consistent transliteration of a Non-Roman (vernacular) script into the Roman Alphabet is needed.
2. Romanization attempts to transliterate the original script, the guiding principle is a one-to-one mapping of characters in the source language into the target script, with less emphasis on how the result sounds when pronounced according to the reader's language.
3. It would be virtually impossible to retrieve the original word in Assamese language from the Romanized word based on pronunciation.
The only shortcoming of my proposal was that I did not employ markers to distinguish the graphemes which represent a single phoneme in the Assamese alphabet. Had they told me that, I might have been able to alter it to their needs. But they closed off any scope for that with the following statement, which I quote: "The current Assamese romanization table reflects the goals of the ALA-LC romanization tables as developed by the library community. We appreciate your interest in the Assamese romanization table. Please let us know if we may be of further assistance."
The communications with the United States Library of Congress authorities are attached hereto as DOCUMENT D.05.
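The reversibility objection, and the role that markers would play in answering it, can be illustrated with a toy Python sketch. Both tables below are hypothetical and are neither the ALA-LC table nor the table proposed in DOCUMENT D.04; the sketch assumes, purely for illustration, that the three sibilant letters share one sound in Assamese and writes that sound as "x".

```python
# Hypothetical toy tables for three letters; not the ALA-LC or ISO 15919 tables.
BY_SOUND = {"শ": "x", "ষ": "x", "স": "x"}       # purely pronunciation-based
WITH_MARKERS = {"শ": "ś", "ষ": "ṣ", "স": "s"}   # one distinct Latin form per letter


def is_reversible(table):
    """A table is reversible when no two source letters share a romanized form."""
    return len(set(table.values())) == len(table)


def invert(table):
    """Build the Latin-to-script table, refusing tables that lose information."""
    if not is_reversible(table):
        raise ValueError("romanization is not one-to-one")
    return {latin: letter for letter, latin in table.items()}


print(is_reversible(BY_SOUND), is_reversible(WITH_MARKERS))
```

A table of the second kind keeps every source letter recoverable, which is the property the Library of Congress reply asks for.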
CONSEQUENCES OF THE MISREPRESENTATIONS OF THE ASSAMESE SCRIPT IN INTERNATIONAL STANDARDS
The immediate effects of the misrepresentations of the Assamese script in the International Standards are the eclipse of the Assamese script in the ISO 10646-1/Unicode Standard, followed by the non-inclusion of the Assamese script in all other International Standards where it should have had its presence. The net results can be summarized as follows:
1. Loss of identity of the Assamese Script
In the present situation, unless it is rectified by the collective effort of the people and the Government of Assam, there is nothing called the ASSAMESE SCRIPT in the National and International Standards.
2. Loss of historical heritage hundreds of years old
The Assamese script is one of the oldest, or say one of the most ancient, of the Indic scripts. Specimens of this script in stone and metal inscriptions have been found at sites not only in Assam but also in the Arakan/Rakhine state of Myanmar/Burma, dating back to as early as the 5th/6th century AD. In fact, some of the best preserved specimens of the Assamese script have been discovered in the Arakan/Rakhine state of Myanmar/Burma. The ancient inscriptions in stone, in metal and in writings are in three main languages: Assamese, Sanskrit and Pali.
The Assamese script was developed in Assam during the Kamrup era of Assam's history. The script changes its character according to the language in which it is used. The script when written in Assamese is a different one from the one in which Sanskrit, or languages following the Sanskrit type of phonology, is written. It is this difference which differentiates the Bengali script from the Assamese script. Although the graphical representations of many of the letters are similar in appearance, a large number of them represent totally different entities. A similar situation exists between the three major European scripts, namely Latin, Greek and Cyrillic. The similar-looking characters of the Latin, Greek and Cyrillic scripts have different representations in the International encodings, namely ISO 10646 and the Unicode Standard. The same principle can be applied to give separate encodings to the Assamese and Bengali scripts. This representation of multiple forms is known in computer parlance as Duplication. A chart of Latin, Greek and Cyrillic duplication is attached herewith as DOCUMENT E.01.
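The duplication between Latin, Greek and Cyrillic can be seen for a single capital letter with a short Python sketch, assuming only the standard unicodedata module. The dictionary LOOKALIKES and the helper code_points are illustrative names. The three characters look alike in most typefaces yet carry separate code points and script-specific names.

```python
import unicodedata

# One visually similar capital letter from each of the three scripts.
LOOKALIKES = {"Latin": "\u0041", "Greek": "\u0391", "Cyrillic": "\u0410"}


def code_points(chars):
    """Map each script label to the code point of its look-alike letter."""
    return {script: f"U+{ord(ch):04X}" for script, ch in chars.items()}


print(code_points(LOOKALIKES))
for script, ch in LOOKALIKES.items():
    print(script, unicodedata.name(ch))
```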
3. Handicaps and disabilities in the operation of the Assamese script
Except for the ability to type on computers, most other functions that need to be performed in the operation of the Assamese script are distorted, disabled or handicapped. This is further compounded by the fact that the Assamese script is missing from all the ISO standards that exist for the scripts of the world.
Due to the absence of the graphical form of the last letter of the Assamese alphabet, ক্ষ (Khya), in ISO 10646/Unicode, a proper sorting operation is impossible in the present Assamese script (included in Bengali).
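The sorting problem can be reproduced with a small Python sketch. It compares a plain code point sort with a hypothetical tailored key that treats the Khya sequence as a single unit placed after every other letter; the key function and its private-use placeholder are assumptions made for illustration, not an existing collation standard.

```python
KHYA = "\u0995\u09CD\u09B7"  # ক্ষ, stored as KA + VIRAMA + SSA
LETTERS = ["\u09B9", KHYA, "\u0996", "\u0995"]  # হ, ক্ষ, খ, ক

# Plain sort compares code points, so Khya lands next to KA, not at the end.
by_code_point = sorted(LETTERS)


def assamese_key(word):
    """Hypothetical tailoring: swap Khya for a placeholder that sorts after all letters."""
    return word.replace(KHYA, "\uE000")


by_tailored_key = sorted(LETTERS, key=assamese_key)

print(by_code_point)
print(by_tailored_key)
```

With the plain sort Khya falls between ক and খ; only the tailored key puts it last, where the report says the Assamese alphabet expects it.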
While translating a present Assamese script (included in Bengali) webpage on the Internet, the translation takes place between Bengali and the target language, for example English. Screenshot pasted below.
While searching for any matter in the search engines in the Assamese script (included inside Bengali), two phenomena are noted. If the search word or words contain the two graphically dissimilar letters, then all the references that surface in the search are Assamese. But if those letters are not present, then the majority of the search results, particularly on the first page, are invariably Bengali. Documents showing this phenomenon are attached herewith as DOCUMENT E.02 and DOCUMENT E.03.
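Why the two letters make such a difference to a search can be sketched in Python. The function below is a hypothetical heuristic, not the method of any actual search engine: it reports whether a text contains either of the two letters that occur in Assamese but not in Bengali orthography. When neither is present, the code points alone give no indication of the language.

```python
# The two letters encoded in the Bengali block specifically for Assamese.
ASSAMESE_MARKERS = {"\u09F0", "\u09F1"}  # ৰ and ৱ


def has_assamese_marker(text):
    """Return True if the text contains ৰ or ৱ; False says nothing either way."""
    return any(ch in ASSAMESE_MARKERS for ch in text)


print(has_assamese_marker("গুৱাহাটী"))  # Assamese spelling of Guwahati, contains ৱ
print(has_assamese_marker("অসমীয়া"))   # an Assamese word with neither marker letter
```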
I have seen these things from my practical experience, but more such technical problems are sure to be discovered by many others, now or in the future.
But the computer experts of the Government of India (DEITy) have a solution for all of these: using patching software to rectify them. In fact, in the computer world there is a patch for all problems. I have personally come to realise this important fact through my interactions with the officials of the DEITy, Government of India, in the meeting held on the 5th of February 2014 at the Manak Bhawan office of the Bureau of Indian Standards (BIS) in New Delhi. In this particular instance, the patches will be required for the users of the Assamese script included/eclipsed inside Bengali, not for the users of Bengali. If we are to go along with their stance, we will be leaving the future generations of Assamese script users in computing the gift of a script that is full of patches, crippled and clumsy.
CONCLUSION
1. It is highly essential that the Assamese script be encoded separately from Bengali in all the International Standards, unless we choose to procrastinate and agree to have a defective and crippled Assamese script for use on computers.
2. To do this, it should be conclusively proved that Assamese is indeed a separate script in spite of having a large number of graphic characters similar in graphic form.
3. It should be clearly shown that many of these similar graphic characters in reality have a different identity.
4. The basis for the differing identity lies in the differing phonology of Assamese and Bengali. This difference between Assamese and Bengali is applicable to all other Indian scripts, and this basic difference is therefore, in reality, the difference between the Assamese and Sanskrit scripts.
5. The difference between Assamese and Sanskrit becomes most obvious in the transliteration and transcription of these scripts. The present progress on the issue was initiated by my pointing out transliteration errors for the Assamese script in the ISO 15919:2001 Standard.
6. There is a huge degree of omission on the part of present-day Assamese scholars in the matter of highlighting the transliteration and transcription differences between Assamese and Sanskrit.
7. It is by this act of Sanskritisation of the Assamese script, and also of the language, that the present crisis of the Assamese script has been generated.
8. It is the utmost duty of the Government of Assam to clear all such misrepresentations of the Assamese script and language first on the home front itself; only then can we proceed with the rectification of the misrepresentations of the Assamese script at the national and international level.
9. The only effect, though not essentially a problem, that will occur if Assamese and Bengali are given two distinct scripts in the International Encodings is duplication. Duplication, as mentioned above, already exists in the ISO 10646/Unicode Standard between Latin, Greek and Cyrillic; in addition, there is some amount of duplication between the southern Indian scripts and the Myanmar and Chakma scripts. Some problems have been caused by unscrupulous persons on the Internet resorting to illegal activities like phishing, but only in the case of the Latin, Greek and Cyrillic scripts. The situation between these scripts and that between Assamese and Bengali are not the same. Moreover, the decision-making power on whether the Assamese and Bengali scripts should be separated or not should be in the hands of the Governments of Assam and West Bengal and the sovereign country of Bangladesh, and the people of these places. If the respective Governments and the people so decide, and take the decision to bear the consequences, if any, then the International Organisations and the American company named Unicode Incorporated should comply with the same.
10. It is also to be remembered, in the end, that conservation of one's script is a Right guaranteed by The Constitution of India. Article 29(1) of the Constitution of India states as follows:
"Cultural and Educational Rights
29. (1) Any section of the citizens residing in the territory of India or any part thereof having a distinct language, script or culture of its own shall have the right to conserve the same."
Hence the identity of this ancient script should not, at any cost, be allowed to be destroyed and become extinct in the name of technology and modernization.
Dr Satyakam Phukan
General Surgeon
Hem Chandra Road
Jorpukhuripar, Uzanbazar
Guwahati, Assam
P.I.N : 781001
Phone : 99540 46357
Dated : Guwahati the 18th of March 2014
10/20/12 ISO 15924 - Alphabetical Code List
1/7 www.unicode.org/iso15924/iso15924-codes.html
ISO 15924 Code Lists
Codes for the representation of names of scripts
Codes pour la représentation des noms d'écritures
Table 1 Alphabetical list of four-letter script codes
Liste alphabétique des codets d'écriture à quatre lettres
Code | N° | English Name | Nom français | Property Value Alias | Date
Afak | 439 | Afaka | afaka | | 2010-12-21
Aghb | 239 | Caucasian Albanian | aghbanien | | 2012-10-16
Arab | 160 | Arabic | arabe | Arabic | 2004-05-01
Armi | 124 | Imperial Aramaic | araméen impérial | Imperial_Aramaic | 2009-06-01
Armn | 230 | Armenian | arménien | Armenian | 2004-05-01
Avst | 134 | Avestan | avestique | Avestan | 2009-06-01
Bali | 360 | Balinese | balinais | Balinese | 2006-10-10
Bamu | 435 | Bamum | bamoum | Bamum | 2009-06-01
Bass | 259 | Bassa Vah | bassa | | 2010-03-26
Batk | 365 | Batak | batik | Batak | 2010-07-23
Beng | 325 | Bengali | bengalî | Bengali | 2004-05-01
Blis | 550 | Blissymbols | symboles Bliss | | 2004-05-01
Bopo | 285 | Bopomofo | bopomofo | Bopomofo | 2004-05-01
Brah | 300 | Brahmi | brahma | Brahmi | 2010-07-23
Brai | 570 | Braille | braille | Braille | 2004-05-01
Bugi | 367 | Buginese | bouguis | Buginese | 2006-06-21
Buhd | 372 | Buhid | bouhide | Buhid | 2004-05-01
Cakm | 349 | Chakma | chakma | Chakma | 2012-02-06
Cans | 440 | Unified Canadian Aboriginal Syllabics | syllabaire autochtone canadien unifié | Canadian_Aboriginal | 2004-05-29
Cari | 201 | Carian | carien | Carian | 2007-07-02
Cham | 358 | Cham | cham (čam, tcham) | Cham | 2009-11-11
Cher | 445 | Cherokee | tchérokî | Cherokee | 2004-05-01
Cirt | 291 | Cirth | cirth | | 2004-05-01
Copt | 204 | Coptic | copte | Coptic | 2006-06-21
Cprt | 403 | Cypriot | syllabaire chypriote | Cypriot | 2004-05-01
Cyrl | 220 | Cyrillic | cyrillique | Cyrillic | 2004-05-01
Cyrs | 221 | Cyrillic (Old Church Slavonic variant) | cyrillique (variante slavonne) | | 2004-05-01
Deva | 315 | Devanagari (Nagari) | dévanâgarî | Devanagari | 2004-05-01
Dsrt | 250 | Deseret (Mormon) | déseret (mormon) | Deseret | 2004-05-01
Dupl | 755 | Duployan shorthand, Duployan stenography | sténographie Duployé | | 2010-07-18
Egyd | 070 | Egyptian demotic | démotique égyptien | | 2004-05-01
Egyh | 060 | Egyptian hieratic | hiératique égyptien | | 2004-05-01
Egyp | 050 | Egyptian hieroglyphs | hiéroglyphes égyptiens | Egyptian_Hieroglyphs | 2009-06-01
Elba | 226 | Elbasan | elbasan | | 2010-07-18
Ethi | 430 | Ethiopic (Geʻez) | éthiopien (geʻez, guèze) | Ethiopic | 2004-10-25
Geok | 241 | Khutsuri (Asomtavruli and Nuskhuri) | khoutsouri (assomtavrouli et nouskhouri) | Georgian | 2012-10-16
Geor | 240 | Georgian (Mkhedruli) | géorgien (mkhédrouli) | Georgian | 2004-05-29
Glag | 225 | Glagolitic | glagolitique | Glagolitic | 2006-06-21
Goth | 206 | Gothic | gotique | Gothic | 2004-05-01
Gran | 343 | Grantha | grantha | | 2009-11-11
Grek | 200 | Greek | grec | Greek | 2004-05-01
Gujr | 320 | Gujarati | goudjarâtî (gujrâtî) | Gujarati | 2004-05-01
Guru | 310 | Gurmukhi | gourmoukhî | Gurmukhi | 2004-05-01
Hang | 286 | Hangul (Hangŭl, Hangeul) | hangûl (hangŭl, hangeul) | Hangul | 2004-05-29
10/20/12 ISO 15924 - Alphabetical Code List
3/7www.unicode.org/iso15924/iso15924-codes.html
Hani 500 Han (Hanzi, Kanji, Hanja) idéogrammes han (sinogrammes) Han 2009-02-23
Hano 371 Hanunoo (Hanunóo) hanounóo Hanunoo 2004-05-29
Hans 501 Han (Simplified variant) idéogrammes han (variante simplifiée) 2004-05-29
Hant 502 Han (Traditional variant) idéogrammes han (variante traditionnelle) 2004-05-29
Hebr 125 Hebrew hébreu Hebrew 2004-05-01
Hira 410 Hiragana hiragana Hiragana 2004-05-01
Hluw 080 Anatolian Hieroglyphs (Luwian Hieroglyphs, Hittite Hieroglyphs) hiéroglyphes anatoliens (hiéroglyphes louvites, hiéroglyphes hittites) 2011-12-09
Hmng 450 Pahawh Hmong pahawh hmong 2004-05-01
Hrkt 412 Japanese syllabaries (alias for Hiragana + Katakana) syllabaires japonais (alias pour hiragana + katakana) Katakana_Or_Hiragana 2011-06-21
Hung 176 Old Hungarian (Hungarian Runic) runes hongroises (ancien hongrois) 2012-10-16
Inds 610 Indus (Harappan) indus 2004-05-01
Ital 210 Old Italic (Etruscan, Oscan, etc.) ancien italique (étrusque, osque, etc.) Old_Italic 2004-05-29
Java 361 Javanese javanais Javanese 2009-06-01
Jpan 413 Japanese (alias for Han + Hiragana + Katakana) japonais (alias pour han + hiragana + katakana) 2006-06-21
Jurc 510 Jurchen jurchen 2010-12-21
Kali 357 Kayah Li kayah li Kayah_Li 2007-07-02
Kana 411 Katakana katakana Katakana 2004-05-01
Khar 305 Kharoshthi kharochthî Kharoshthi 2006-06-21
Khmr 355 Khmer khmer Khmer 2004-05-29
Khoj 322 Khojki khojkî 2011-06-21
Knda 345 Kannada kannara (canara) Kannada 2004-05-29
Kore 287 Korean (alias for Hangul + Han) coréen (alias pour hangûl + han) 2007-06-13
Kpel 436 Kpelle kpèllé 2010-03-26
Kthi 317 Kaithi kaithî Kaithi 2009-06-01
Lana 351 Tai Tham (Lanna) taï tham (lanna) Tai_Tham 2009-06-01
Laoo 356 Lao laotien Lao 2004-05-01
Latf 217 Latin (Fraktur variant) latin (variante brisée) 2004-05-01
Latg 216 Latin (Gaelic variant) latin (variante gaélique) 2004-05-01
Latn 215 Latin latin Latin 2004-05-01
Lepc 335 Lepcha (Róng) lepcha (róng) Lepcha 2007-07-02
Limb 336 Limbu limbou Limbu 2004-05-29
Lina 400 Linear A linéaire A 2004-05-01
Linb 401 Linear B linéaire B Linear_B 2004-05-29
Lisu 399 Lisu (Fraser) lisu (Fraser) Lisu 2009-06-01
Loma 437 Loma loma 2010-03-26
Lyci 202 Lycian lycien Lycian 2007-07-02
Lydi 116 Lydian lydien Lydian 2007-07-02
Mahj 314 Mahajani mahâjanî 2012-10-16
Mand 140 Mandaic, Mandaean mandéen Mandaic 2010-07-23
Mani 139 Manichaean manichéen 2007-07-15
Maya 090 Mayan hieroglyphs hiéroglyphes mayas 2004-05-01
Mend 438 Mende mendé 2010-03-26
Merc 101 Meroitic Cursive cursif méroïtique Meroitic_Cursive 2012-02-06
Mero 100 Meroitic Hieroglyphs hiéroglyphes méroïtiques Meroitic_Hieroglyphs 2012-02-06
Mlym 347 Malayalam malayâlam Malayalam 2004-05-01
Mong 145 Mongolian mongol Mongolian 2004-05-01
Moon 218 Moon (Moon code, Moon script, Moon type) écriture Moon 2006-12-11
Mroo 199 Mro, Mru mro 2010-12-21
Mtei 337 Meitei Mayek (Meithei, Meetei) meitei mayek Meetei_Mayek 2009-06-01
Mymr 350 Myanmar (Burmese) birman Myanmar 2004-05-01
Narb 106 Old North Arabian (Ancient North Arabian) nord-arabique 2010-03-26
Nbat 159 Nabataean nabatéen 2010-03-26
Nkgb 420 Nakhi Geba ('Na-'Khi ²Ggŏ-¹baw, Naxi Geba) nakhi géba 2009-02-23
Nkoo 165 N’Ko n’ko Nko 2006-10-10
Nshu 499 Nüshu nüshu 2010-12-21
Ogam 212 Ogham ogam Ogham 2004-05-01
Olck 261 Ol Chiki (Ol Cemet’, Ol, Santali) ol tchiki Ol_Chiki 2007-07-02
Orkh 175 Old Turkic, Orkhon Runic orkhon Old_Turkic 2009-06-01
Orya 327 Oriya oriyâ Oriya 2004-05-01
Osma 260 Osmanya osmanais Osmanya 2004-05-01
Palm 126 Palmyrene palmyrénien 2010-03-26
Perm 227 Old Permic ancien permien 2004-05-01
Phag 331 Phags-pa ’phags pa Phags_Pa 2006-10-10
Phli 131 Inscriptional Pahlavi pehlevi des inscriptions Inscriptional_Pahlavi 2009-06-01
Phlp 132 Psalter Pahlavi pehlevi des psautiers 2007-11-26
Phlv 133 Book Pahlavi pehlevi des livres 2007-07-15
Phnx 115 Phoenician phénicien Phoenician 2006-10-10
Plrd 282 Miao (Pollard) miao (Pollard) Miao 2012-02-06
Prti 130 Inscriptional Parthian parthe des inscriptions Inscriptional_Parthian 2009-06-01
Qaaa 900 Reserved for private use (start) réservé à l’usage privé (début) 2004-05-29
Qabx 949 Reserved for private use (end) réservé à l’usage privé (fin) 2004-05-29
Rjng 363 Rejang (Redjang, Kaganga) redjang (kaganga) Rejang 2009-02-23
Roro 620 Rongorongo rongorongo 2004-05-01
Runr 211 Runic runique Runic 2004-05-01
Samr 123 Samaritan samaritain Samaritan 2009-06-01
Sara 292 Sarati sarati 2004-05-29
Sarb 105 Old South Arabian sud-arabique, himyarite Old_South_Arabian 2009-06-01
Saur 344 Saurashtra saurachtra Saurashtra 2007-07-02
Sgnw 095 SignWriting SignÉcriture, SignWriting 2006-10-10
Shaw 281 Shavian (Shaw) shavien (Shaw) Shavian 2004-05-01
Shrd 319 Sharada, Śāradā charada, shard Sharada 2012-02-06
Sind 318 Khudawadi, Sindhi khoudawadî, sindhî 2010-12-21
Sinh 348 Sinhala singhalais Sinhala 2004-05-01
Sora 398 Sora Sompeng sora sompeng Sora_Sompeng 2012-02-06
Sund 362 Sundanese sundanais Sundanese 2007-07-02
Sylo 316 Syloti Nagri sylotî nâgrî Syloti_Nagri 2006-06-21
Syrc 135 Syriac syriaque Syriac 2004-05-01
Syre 138 Syriac (Estrangelo variant) syriaque (variante estranghélo) 2004-05-01
Syrj 137 Syriac (Western variant) syriaque (variante occidentale) 2004-05-01
Syrn 136 Syriac (Eastern variant) syriaque (variante orientale) 2004-05-01
Tagb 373 Tagbanwa tagbanoua Tagbanwa 2004-05-01
Takr 321 Takri, Ṭākrī, Ṭāṅkrī tâkrî Takri 2012-02-06
Tale 353 Tai Le taï-le Tai_Le 2004-10-25
Talu 354 New Tai Lue nouveau taï-lue New_Tai_Lue 2006-06-21
Taml 346 Tamil tamoul Tamil 2004-05-01
Tang 520 Tangut tangoute 2010-12-21
Tavt 359 Tai Viet taï viêt Tai_Viet 2009-06-01
Telu 340 Telugu télougou Telugu 2004-05-01
Teng 290 Tengwar tengwar 2004-05-01
Tfng 120 Tifinagh (Berber) tifinagh (berbère) Tifinagh 2006-06-21
Tglg 370 Tagalog (Baybayin, Alibata) tagal (baybayin, alibata) Tagalog 2009-02-23
Thaa 170 Thaana thâna Thaana 2004-05-01
Thai 352 Thai thaï Thai 2004-05-01
Tibt 330 Tibetan tibétain Tibetan 2004-05-01
Tirh 326 Tirhuta tirhouta 2011-12-09
Ugar 040 Ugaritic ougaritique Ugaritic 2004-05-01
Vaii 470 Vai vaï Vai 2007-07-02
Visp 280 Visible Speech parole visible 2004-05-01
Wara 262 Warang Citi (Varang Kshiti) warang citi 2009-11-11
Wole 480 Woleai woléaï 2010-12-21
Xpeo 030 Old Persian cunéiforme persépolitain Old_Persian 2006-06-21
Xsux 020 Cuneiform, Sumero-Akkadian cunéiforme suméro-akkadien Cuneiform 2006-10-10
Yiii 460 Yi yi Yi 2004-05-01
Zinh 994 Code for inherited script codet pour écriture héritée Inherited 2009-02-23
Zmth 995 Mathematical notation notation mathématique 2007-11-26
Zsym 996 Symbols symboles 2007-11-26
Zxxx 997 Code for unwritten documents codet pour les documents non écrits 2011-06-21
Zyyy 998 Code for undetermined script codet pour écriture indéterminée Common 2004-05-29
Zzzz 999 Code for uncoded script codet pour écriture non codée Unknown 2006-10-10
Code N° English Name Nom français Property Value Alias Date
Copyright © 2004-2012 ISO, Unicode, Inc., & Evertype. All Rights Reserved.
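Each row of the code list above follows the column order given in its header line: four-letter code, three-digit number, English name, French name, an optional property value alias, and a date. As a minimal sketch only (the function name `parse_row` is a hypothetical helper, not part of ISO 15924 or of any library), the fixed-position fields of a row could be pulled apart as below. The name columns are kept together as one string, because the extracted text does not mark where one name ends and the next begins.

```python
import re

# Code (capital + 3 lowercase letters), number (3 digits), free text, date.
ROW = re.compile(r"^([A-Z][a-z]{3})\s+(\d{3})\s*(.*?)\s*(\d{4}-\d{2}-\d{2})$")

def parse_row(line):
    """Split one code-list row into its fixed-position fields.

    Returns None when the line does not look like a table row.
    """
    m = ROW.match(line.strip())
    if m is None:
        return None
    code, number, names, date = m.groups()
    # The number stays a string so that leading zeros (e.g. "050") survive.
    return {"code": code, "number": number, "names": names, "date": date}

row = parse_row("Latn 215 Latin latin Latin 2004-05-01")
print(row["code"], row["number"], row["date"])
```

Lines that are not table rows, such as the page header, simply return None, so the same function can be used to filter a raw page dump.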
DOCUMENT B.01
DOCUMENT B.02
From: "Lisa Moore"
To: "Swaran Lata"; "Manoj Jain"
Cc:
Subject: Assamese writing system in Unicode
Date: Wednesday, August 17, 2011 11:24 PM
Dear Manoj and Swaran,
The Unicode office has recently received several emails from some members of the
Assamese community, objecting to the way Assamese is addressed in Unicode. They
feel Assamese is considered a sub-class to Bengali in Unicode, and it should be given
its own block.
Currently, two characters in the Bengali code block have annotations Assamese and
the text of the Unicode Standard states that the Bengali script is used to write
Assamese in Assam and a number of other minority languages. Based on our review,
the Bengali script adequately covers the Assamese language.
We have replied to the Assamese authors that we are in receipt of their emails and
will be in contact with the Government of India regarding this request.
As this request is coming from India and has political implications, we feel it is an
issue that the Government of India will wish to address. The Unicode Technical
Committee will take no action based on the current correspondence. If you would
like copies of the various emails, please let us know. I append one such email below.
DOCUMENT B .03
Most sincerely,
Lisa Moore
Chair, Unicode Technical Committee
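The letter above mentions two characters in the Bengali code block that carry an Assamese annotation. The sketch below assumes these are U+09F0 and U+09F1 (the letter itself gives no code points) and uses Python's standard `unicodedata` module to print their formal character names. It illustrates the point under dispute: the formal names carry the word BENGALI, while the Assamese usage is recorded only as an annotation in the code chart.

```python
import unicodedata

# Assumed code points of the two characters annotated as Assamese.
ASSAMESE_ANNOTATED = ["\u09F0", "\u09F1"]

def describe(ch):
    """Return (code point label, formal Unicode character name)."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)

for ch in ASSAMESE_ANNOTATED:
    label, name = describe(ch)
    print(label, name)
```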
From: SATYAKAM PHUKAN [mailto:[email protected]]
Sent: Wednesday, July 13, 2011 10:11 AM
To: [email protected]
Subject: Erroneous nomenclature of the Assamese writing system as a subclass of Bengali in Unicode Consortium/Inc
To
Ms Magda Danish
Unicode Consortium/Inc
USA
Subject : Erroneous nomenclature of the Assamese writing system as a subclass of Bengali in Unicode Consortium/Inc.
Madam,
There has been considerable displeasure over the naming of writing system of the
Assamese language as a subclass of Bengali.
The fact needs to be cleared up regarding the nomenclature of the similar alphabets
used by the Assamese, Maithili, Bengali and Manipuri languages. This script is
actually the KAMRUPI script, it developed in the ancient kingdom of Kamrup, the
precursor or the older name of Assam. Kamrup had fixed boundaries from east to
west. In east it ended in present eastern border of India and in the west it extended up
to the river Korotoya now in the areas of so-called north Bengal. The indigenous
people of this area of so-called north Bengal still differentiate themselves from the
Bengalis and a movement for a separate state of Kamatapur spearheaded by extremist
organisations like KLO (Kamatapur Liberation Organisation) with close links with
extremist ULFA is there.
From time to time the ancient kingdom of Kamrup in pre-Muslim era used to make
conquering forays into adjacent areas of mainland India and had ruled many areas of
mainland India. Consequently whole of Bengal, eastern part of Bihar mainly the
Mithila area and Orissa were under rule of Kamrup for considerable period of time.
The KAMRUPI SCRIPT is used only in areas which were part of ancient kingdom of
Kamrup or were under the rule of Kamrup. So in Bihar this script is used only in
Mithila area which was once under Kamrup rule but not in areas of Bihar to the west
of it which were never under Kamrup rule. The similarities of these languages are
also due to this fact of the Kamrup rule in all these areas. Currently the Maithilis, i.e.
the indigenous people of Mithila use the Devnagari script for most purposes but they
still retain the use of the Kamrupi script for religious purposes. But there is a move by
several of the Maithili scholars to revive the Mithilakshar script, the name with which
this form of Kamrupi script is known there. The alphabets of the Assamese and the
Maithili versions are almost same and these scripts are phonetically complete.
Whereas the form used by the Bengali is phonetically lacking because they do not
have any alphabet to represent the sound wa. This is because the Bengali was using
the same script as is used by the Assamese till the coming of the British. Notable
example is the alphabet for the sound ra, they were using the same alphabet the
Assamese and the Maithili uses till the British period. Then they started using the
alphabet the Maithili uses for denoting wa for representing the sound ra and in
the process ended up having no alphabet for representing wa. The sound wa in
the present Bengali is represented on assumption by the alphabet used to denote the
sound ya.
The largest in number and oldest in time of the literatures in this system of writing
belong exclusively to the Assamese language. Maithili is the second oldest in terms
of time of writing and literary history of Bengali in the pre-British era comes much
later than the other two.
In the British period due to unscrupulous manipulation by a section of Bengali
scholars and intelligentsia, the British rulers were manipulated in believing that the
Assamese is just a peasant form or patois of the Bengali language. Thus started more
than seventy years of Bengali imposition as the official and educational language in
Assam. It was due to the struggle of the American Baptist Missionaries
complemented with the effort of the budding Assamese intelligentsia and right
thinking Bengali intellectuals who opposed their parochial compatriots, that
Assamese was given back its rightful place in Assam.
There is now a move by a section of intellectuals to rename the writing system as
EASTERN NAGARI, this is far more erroneous because this script has nothing to
do with Nagari form of writing except a common source of borrowing of the schema
or concept but not the alphabets from the Brahmi script.
The script with which the Kamrupi script share the highest similarity is the Tibetan
script. The major similarities of the Kamrupi and the Tibetan system of writing is the
use of angular or triangular form or shapes in the alphabets. I have mapped out the
alphabets in the attached pdf file. The Kamrupi-Tibetan linkages which are quite
obvious even to a layman have been overlooked and not thought of by other scholars
who had written on the subject. The Unicode has therefore wrongly written about the
Bengali(???)/Assamese script as being very similar with Devnagari. We are in the
process of making a website on the issue of wrong nomenclature of this writing
system, which will be in the form of a memorandum with provisions for obtaining
signatures online and send to the Unicode Consortium for rectification of the
injustice.
The proposal for the same has three options for the rectification :
FIRST : Give a separate slot to the Assamese writing system / fonts
SECOND : Rename the script as KAMRUPI
THIRD : Rename the script as AMBM (Assamese-Maithili-Bengali-Manipuri)
The third option is given on the basis of chronological basis of the use of this script in
the ancient and pre-British era.
Let's hope Unicode Consortium will take steps in the right direction, please inform us
whether it will be necessary for us to go ahead with the proposed Website for the purpose
of rectifying the mistake/injustice or sending the required information or documents
by E-mail will suffice.
Dr Satyakam Phukan
Jorpukhuripar, Uzanbazar
Guwahati, Assam (INDIA)
P.I.N : 781001
Phone : +91 99540 46357
E-mail : [email protected]
The links below will give more information, the first on the controversy and the second
on the roots and connections of the Assamese language.
http://rajivkonwar100.blogspot.com/2011/07/assamese-experts-question-sahitya-sabha.html
Roots and Strings of the Assamese Language
From: AZIZ-UL HAQUE [mailto:[email protected]]
Sent: Thursday, July 21, 2011 8:25 AM
To: [email protected]; [email protected]
Cc: [email protected]
Subject: Assamese writing system in unicode
To
Ms. Magda Danish
Unicode Consortium Inc. USA
Dear Madam
Greetings from Guwahati, Assam, India. I am grieved to know that the Assamese
writing system has been kept as a sub-class of Bengali in Unicode.
In 1836 the British rulers imposed Bengali in Assam thinking that it was a patois or
colloquial dialect or distortion of Bengali. It took about 37 years of struggle which
was spearheaded by the American Baptist Missionaries led by Dr. Miles Bronson in
convincing the British administration that Assamese was a distinct language.
Assamese was finally reinstated in 1873.
The origin of Assamese script can be traced back to as early as 300 B.C., found in
inscription on stones during the reign of Ashoka the Great. Thus, it has a long history
and it developed through the ages. The ancient name of Assam was Kamrup and for a
considerable period its territory was extended to the Mithila area of Bihar, Orissa and
Bengal. There are sure proofs of distinct Kamrupi script which was written in the 8th
century. The people of those areas came under the influence of culture and language
of Kamrup. Moreover, there had been cordial relations of Kamrup with the
neighboring kingdoms. The people of those areas either used this ancient Assamese
script or borrowed the idea of this script. That is why there is a close affinity of
Assamese with Bengali, Maithili, Oria (pronunciation) and Manipuri. There are
many historical and documentary evidences to show that Assamese is a distinct
language from Bengali.
Therefore, Madam, we feel that a separate slot be given to Assamese or rename the
script as Kamrupi. A third option can be to rename the script as AMBM for
Assamese-Maithili-Bengali-Manipuri basing on the chronological development and
use of the script in the ancient times.
Please let me know if I need to suffice with documentary evidences.
Hope you will look into the matter and do the needful.
With regards.
Yours sincerely
A. Haque
Address: Aziz-ul Haque, Pastor, Guwahati Baptist Church, Panbazar, Guwahati-781001, Assam, India. Phone-09864023020.
N Ravi Shanker, Additional Secretary
GOVERNMENT OF INDIA
MINISTRY OF COMMUNICATIONS AND INFORMATION TECHNOLOGY
DEPARTMENT OF INFORMATION TECHNOLOGY
ELECTRONICS NIKETAN, 6, C.G.O. COMPLEX, New Delhi-110003
Fax: +91-11-24363099
Email: [email protected]
D.O. No. 13(4)/2011-HCC(TDIL)
-
DOCUMENT B.05
BY SPEED POST
GOVERNMENT OF INDIA
-
Minutes of the Meeting on Unicode for Assamese Writing System
Department of Electronics & Information Technology, New Delhi
June 13th, 2012
Concerns were raised by Govt. of Assam regarding the nomenclature/ representation of the Assamese
Writing System in the Unicode Standard. A meeting of Assamese & Bengali experts and Officials of Govt.
of Assam and West Bengal and some industry experts was organized on June 13th, 2012 at 11:00 AM at
Department of Electronics & Information Technology, Electronics Niketan, Lodhi Road, New Delhi to
address the issues raised. The list of the participants is placed at Annexure-I
Representatives of Government of Assam mentioned that there is erroneous nomenclature of the
Assamese writing system as a subclass of Bengali in the Unicode Standard. They submitted that the
script used for writing Assamese, Bengali, Maithili and Manipuri is "Kamrupi" as it was the writing
system used in ancient Kamrup state, whereas in the Unicode Standard it is mentioned as "Bengali". An
email dated 13-7-2011 was sent to Unicode Consortium by Shri Satyakam Phukan. In the mail Shri
Phukan had given his arguments for the name change and suggested three options:
First: Give a separate slot to the Assamese writing system / fonts
Second: Rename the script as KAMRUPI
Third: Rename the script as AMBM (Assamese-Maithili-Bengali-Manipuri)
Based on this e-mail, Unicode Consortium included Assamese also in the Code-Chart list hosted on
Unicode website (http://www.unicode.org/charts/). "Bengali" was changed as "Bengali and Assamese"
on this web link.
Representatives of Government of Assam also requested to change the name of Bengali script as
"Assamese-Bengali" script based on alphabetical order so as to give due recognition to Assamese
Writing System also.
Assamese experts also requested to allocate a separate code block for existing Assamese writing system
and to cater to futuristic needs of various dialects of Assamese.
It was also discussed that the name change request needs to be examined by Unicode Consortium as
neighboring country Bangladesh is also using Bengali.
Experts from industry apprised the members that the internet security threat may arise with the
duplicate encoding of the same glyph with the implementation of Internationalized Domain Names
(IDN).
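The concern recorded in the paragraph above is that two code point sequences which look identical on screen can name two different domains. No duplicate Assamese encoding exists to demonstrate this with, so the sketch below substitutes a well-known stand-in pair of my own choosing: Latin "a" (U+0061) and Cyrillic "а" (U+0430). It uses Python's built-in `idna` codec, which implements the older IDNA 2003 rules; the domain name is purely illustrative.

```python
def to_ascii(domain):
    """Encode a domain name to its ASCII form with Python's built-in IDNA codec."""
    return domain.encode("idna")

latin = "apple.com"            # first letter is U+0061 LATIN SMALL LETTER A
lookalike = "\u0430pple.com"   # first letter is U+0430 CYRILLIC SMALL LETTER A

# The two strings compare unequal even though they may render alike.
print(latin == lookalike)
# Their ASCII forms, which is what name resolution actually uses, also differ.
print(to_ascii(latin))
print(to_ascii(lookalike))
```

A second block that repeated the glyphs of the existing one would create pairs of this kind within a single writing system, which is the risk the industry experts raised.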
Based on the discussions following points were agreed:
1. The name change of the script from "Bengali" to "Bengali and Assamese" reflected on the
website needs to be reflected in other parts of the text in the standard. The names of the
characters are indicated currently as per Bengali Script and additional annotation with
respect to Assamese can be taken up with Unicode for addition. Govt. of Assam may submit
the detailed proposals covering these aspects.
2. Government of Assam shall examine the futuristic need of additional requirements of
Assamese and its dialects and submit a report to DeitY along with the requisite documentary
support.
The meeting ended with the vote of thanks to the Chair.
DOCUMENT B.06
Annexure-I
1. Dr. Rajendra Kumar, Joint Secretary, DeitY, New Delhi (In Chair)
2. Shri Shantanu Thakur, IAS, Commissioner, Excise Department, Govt of Assam
3. Shri M.K.Yadava, IFS, Managing Director, AMTRON
4. Mrs. Swaran Lata, DeitY, New Delhi
5. Prof. B B Chaudhury, ISI, Kolkata; Govt. of West Bengal & SNLTR-West Bengal
6. Shri Monoj Kr. Baruah, Dy. Manager, AMTRON
7. Prof. Lilabati Saikia Bora, Department of Assamese, Gauhati University
8. Dr. Sikhar Sarma, Professor & Head, Dept. Of IT, Gauhati University
9. Dr. Utpal Sharma, Associate Professor, Tezpur University
10. Shri Bhaskarjyoti Sarma, Lecturer, Department of Assamese, Dibrugarh University
11. Dr. Shakuntala Mahanta, HSS Dept, IIT Guwahati
12. Prof. Sivaji Bandyopadhyay, Jadavpur University, Kolkata
13. Shri Debashis Mazumdar - Joint Director, CDAC, Kolkata
14. Shri Akshat Joshi, CDAC, Pune
15. Shri Vijay Kumar, DeitY, New Delhi
16. Shri Manoj K Jain, DeitY, New Delhi
Reference number
ISO 15919:2001(E)
© ISO 2001
INTERNATIONAL STANDARD
ISO 15919
First edition 2001-10-01
Information and documentation - Transliteration of Devanagari and related Indic scripts into Latin characters
Information et documentation - Translittération du Devanagari et des écritures indiennes liées en caractères latins
DOCUMENT C.01
PDF disclaimer
This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.
Adobe is a trademark of Adobe Systems Incorporated.
Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.
© ISO 2001
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.
ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail [email protected]
www.iso.ch
Printed in Switzerland
Contents
1 Scope
2 Conformance
3 Normative references
4 Terms and definitions
5 Abbreviated terms
6 Characteristics of Indic scripts
7 Transliteration tables
8 Special requirements and recommendations
8.1 Special requirements
8.2 Recommendations
9 Options
10 Tables for uniform transliteration of Indic scripts
11 Transliteration scheme for limited character set
12 Recommended transliteration of Indic schemes for Perso-Arabic characters
13 Additional Indic scripts
14 Reverse transliteration
Annex A (normative) Tables for uniform transliteration
Annex B (normative) Transliteration table for limited (7-bit) character set
Annex C (normative) Recommended transliteration of Indic schemes for Perso-Arabic characters
Annex D (informative) Examples of Indic characters used for Perso-Arabic
Annex E (informative) Additional Indic scripts
Annex F (informative) Reverse transliteration of Indic scripts
F.1 Overview
F.2 Examples of reverse transliteration in modern Indic languages
F.3 Reverse transliteration in Vedic texts
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 3.
Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this International Standard may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.
International Standard ISO 15919 was prepared by Technical Committee ISO/TC 46, Information and documentation, Subcommittee SC 2, Conversion of written languages.
Annexes A, B and C form a normative part of this International Standard. Annexes D, E and F are for information only.
Introduction
Script conversion is often required for documents such as historical and literary texts, geographical texts (including maps and atlases), bibliographies, catalogues, lists and passports (and other identification documents).
Text in Devanagari script or other Indic scripts sometimes needs to be shown in Latin script, where users, or equipment that they are using, cannot read or write the text.
INTERNATIONAL STANDARD ISO 15919:2001(E)
Information and documentation - Transliteration of Devanagari and related Indic scripts into Latin characters
1 Scope
This International Standard provides tables which enable the transliteration into Latin characters from text in Indic scripts which are largely specified in rows 09 to 0D of UCS (ISO/IEC 10646-1 and Unicode).
The tables provide for the Devanagari, Bengali (including the characters used for writing Assamese), Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Sinhala, Tamil, and Telugu scripts which are used in India, Nepal, Bangladesh and Sri Lanka. The Devanagari, Bengali, Gujarati, Gurmukhi, and Oriya scripts are North Indian scripts, and the Kannada, Malayalam, Tamil, and Telugu scripts are South Indian scripts.
The Burmese, Khmer, Thai, Lao and Tibetan scripts which also share a common origin with the Indic scripts, and which are used predominantly in Myanmar, Cambodia, Thailand, Laos, Bhutan and the Tibetan Autonomous Region within China, are not covered by this International Standard.
This International Standard applies to transliteration of Devanagari, and to Indic scripts related to Devanagari, independent of the period in which it is or was used (i.e. for Devanagari script it can be used for transliterating text in classical Sanskrit, Hindi, Marathi, and the Vedic language, for instance).
Other Indic scripts whose character repertoires are covered by the tables may also be transliterated using this International Standard.
Options in this International Standard are defined in clause 9.
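The Scope clause refers to "rows 09 to 0D" of the UCS, that is, code points U+0900 through U+0DFF. The following is a minimal sketch of that range check; the function name `in_indic_rows` is a hypothetical helper for illustration and the sample characters are my own choices, not taken from the standard.

```python
def in_indic_rows(ch):
    """True when the character lies in UCS rows 09 to 0D (U+0900..U+0DFF)."""
    return 0x0900 <= ord(ch) <= 0x0DFF

samples = {
    "\u0915": "Devanagari letter ka",
    "\u0995": "Bengali letter ka",
    "\u09F0": "Bengali-block letter annotated as Assamese ra",
    "\u0F40": "Tibetan letter ka (a script the Scope clause excludes)",
}
for ch, label in samples.items():
    print(f"U+{ord(ch):04X}", in_indic_rows(ch), label)
```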
2 Conformance
Text originally in non-Latin script which is converted to a Latin-script representation conforms to this International Standard with or without any of the specific recommendations, if it follows the rules defined in 8.1 and the conversion tables given in clause 7 and normative annexes A and B, with or without following any of the three recommendations given in 8.2 and clause 12, all in accordance with the options defined in clause 9.
A claim of conformance shall specify which options have been chosen, and which recommendations have been followed.
3 Normative references
The following normative documents contain provisions which, through reference in this text, constitute provisions of this International Standard. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. However, parties to agreements based on this International Standard are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below. For undated references, the latest edition of the normative document referred to applies. Members of ISO and IEC maintain registers of currently valid International Standards.
ISO/IEC 10646-1, Information technology - Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane
ISO/IEC 646:1991, Information technology - ISO 7-bit coded character set for information interchange
DOCUMENT C.01
ISO 15919:2001(E)
2 ISO 2001 All rights reserved
4 Terms and definitions
For the purposes of this International Standard, the following terms and definitions apply.
4.1 conversion
representing graphic characters from a source script by the graphic characters of a target script, most commonly by romanization
NOTE The two basic methods of conversion of a system of writing are transliteration and transcription. The use of the terms source script and target script in transliteration is analogous to the terms source language and target language in translation.
4.2 script
set of graphic characters used for the written form of one or more languages
4.3 graphic character
character (other than a control character) that has a visual representation, normally handwritten, printed or displayed
NOTE A graphic character is a single element of a script. Examples are letters, conjunct characters, numerical digits, punctuation marks or diacritical marks.
4.4 reverse transliteration
process whereby the characters of a target script are transliterated into those of the source script
NOTE This International Standard aims to enable reverse-transliterated text to be identical to the original source text up to equivalent orthography. However, non-reversible transcription-like transliterations are often found to be useful when quoting recent material.
4.5 romanization
conversion of non-Latin graphic characters into Latin graphic characters, using either transliteration or transcription
4.6 transcription
representation of the sounds of a source language by graphic characters associated with a target language
4.7 transliteration
representation of the graphic characters of a source script by the graphic characters of a target script
NOTE In transcription, pronunciation conventions are of primary importance, while in transliteration, writing conventions are of primary importance.
4.8 UCS
Universal Multiple-Octet Coded Character Set (UCS) as defined in ISO/IEC 10646-1
NOTE 1 The Indic scripts listed in ISO/IEC 10646-1:1993 form a subset (with identical codes) of the Indic scripts listed in ISO/IEC 10646-1:2000. Similarly, the Indic scripts listed in the Unicode standard (version 1.0 onwards) form a subset (with identical codes) to the Indic scripts listed in ISO/IEC 10646-1:2000 and the Unicode standard, version 3.0. Any of these standards provide valid character codes for the specific characters concerned.
NOTE 2 ISO/IEC 10646-1 is increasingly used for providing character identifiers in a wide range of International Standards, including some in this International Standard. Use of these identifiers does not impose any requirements to use ISO/IEC 10646-1 or any other character coding standard to represent either the source characters or the target characters in any computer system or in information interchange.
5 Abbreviated terms
Ben. Bengali script
Dev. Devanagari script
Guj. Gujarati script
Gur. Gurmukhi script
Kan. Kannada script
Mal. Malayalam script
Ori. Oriya script
Tam. Tamil script
Tel. Telugu script
Sin. Sinhala script
P-A. Perso-Arabic script
6 Characteristics of Indic scripts
Characters in Indic scripts represent vowels, consonants and their combinations; nasalization, breathings, numerals and punctuation.
Each vowel has a full form (occupying a full character space in text, and required when beginning a word or in vowel hiatus) and a combining form (mātrā) used when the vowel follows a consonant, except that the short a standing at the beginning of Indic alphabets has only a full form, because no mātrā is required (see below).
Consonants include stops, semivowels, spirants, and other speech sounds. Stop consonants are arranged in classes, or vargas, according to the point of articulation, and within each class are subdivided into unvoiced or voiced, unaspirated or aspirated consonants, and a nasal consonant.
Characters for consonants are most simply quoted in a form which includes the inherent vowel a, as in the first consonant ka in Table 1. The inherent vowel is removed by the virāma sign of the relevant script (Dev., Ben., Guj., Gur., Ori., Tam., Tel., Kan., Mal., Sin.). The relevant mātrā is used when any other vowel follows a consonant. Consonant clusters frequently form conjunct characters. Use of virāma to form consonant clusters is unusual, except in Tamil where it is the normal method. When a mātrā is associated with a consonant, it replaces the inherent vowel. Mātrās have various forms, even in a single script, and details may be found in dictionaries and grammars.
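The interaction of inherent vowel, mātrā and virāma described above can be illustrated with a minimal Python sketch. It is not part of the standard: the function name transliterate and the three small dictionaries are hypothetical, and they cover only a handful of Devanagari characters from Table 1 rather than the full table or the rules of clause 8.

```python
# Hypothetical, deliberately tiny subset of Table 1 (Devanagari); not the full table.
CONSONANTS = {
    "\u0915": "k",  # ka
    "\u0928": "n",  # na
    "\u092e": "m",  # ma
    "\u0930": "r",  # ra
    "\u0932": "l",  # la
}
MATRAS = {  # combining vowel forms; each replaces the inherent vowel
    "\u093e": "\u0101",  # long a with macron
    "\u093f": "i",
    "\u0940": "\u012b",  # long i with macron
    "\u0941": "u",
}
FULL_VOWELS = {"\u0905": "a", "\u0906": "\u0101", "\u0907": "i"}
VIRAMA = "\u094d"  # removes the inherent vowel


def transliterate(text):
    """Romanize text, supplying the inherent vowel 'a' where nothing overrides it."""
    out = []
    pending = False  # True while the last consonant still carries its inherent vowel
    for ch in text:
        if ch in CONSONANTS:
            if pending:
                out.append("a")  # previous consonant kept its inherent vowel
            out.append(CONSONANTS[ch])
            pending = True
        elif ch in MATRAS:
            out.append(MATRAS[ch])  # matra replaces the inherent vowel
            pending = False
        elif ch == VIRAMA:
            pending = False  # virama removes the inherent vowel
        else:
            if pending:
                out.append("a")
                pending = False
            out.append(FULL_VOWELS.get(ch, ch))  # unknown characters pass through
    if pending:
        out.append("a")
    return "".join(out)
```

With this sketch, the sequence ka, ma, la yields "kamala", while ka, virāma, ra, ma yields "krama" because the virāma suppresses the vowel of the first consonant.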
It is important to note that many Indic characters have variant forms. Such differences of orthography are not distinguished in this International Standard.
Devanagari is used for writing various modern languages, such as Hindi, Marathi, Rajasthani and other languages in India, and Nepali in Nepal. Devanagari and most of the other Indic scripts are used for writing classical languages often used in religious texts, such as the Sanskrit and Vedic languages, and Pali. In some cases, text in Indic scripts uses additional characters for writing words in languages which do not normally use these scripts. Thus some Urdu consonants are typically represented by adding a dot (nuqta) below certain letters (see Table 1, normative annex C and informative annex D). Two English vowels may also be represented. Devanagari has also been extended to write South Indian languages.
Sinhala script (used in Sri Lanka) has additional letters, in comparison with the scripts which are used in India, Nepal and Bangladesh. Tamil script (used in South India and also in Sri Lanka) uses fewer characters, in comparison with other scripts which are used in India, Nepal, Bangladesh and Sri Lanka.
When the Bengali script is used to write the Assamese language (in parts of North India), two characters not used in writing Bengali are required. Hence the Assamese script is sometimes regarded as separate from the Bengali script.
7 Transliteration tables
7.1 The transliteration from each Indic script to the Latin script shall be as specified in the Tables 1 to 10 and A.3, subject to the rules specified in 8.1 and the options specified in clause 9.
7.2 The structure of the transliteration tables is explained in the following paragraphs.
The target characters (Latin script) fall within the ranges 0020-01FF and 0300-0332 of ISO/IEC 10646-1:2000.
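A transliterated string can be checked against the two target ranges quoted above with a short Python sketch. The function name in_target_repertoire is hypothetical, and decomposing the text with NFD before checking is an assumption made here so that letters with diacritics are tested as base letter plus combining mark; the standard itself only states the ranges.

```python
import unicodedata

# Target ranges quoted in 7.2 (inclusive, hexadecimal code points).
TARGET_RANGES = [(0x0020, 0x01FF), (0x0300, 0x0332)]


def in_target_repertoire(text):
    """Return True if every code point of the NFD-decomposed text lies in a target range."""
    decomposed = unicodedata.normalize("NFD", text)  # assumption: check decomposed form
    return all(
        any(lo <= ord(ch) <= hi for lo, hi in TARGET_RANGES)
        for ch in decomposed
    )
```

Under this assumption a precomposed letter such as t with dot below (U+1E6D) passes, because it decomposes into t followed by the combining dot below (U+0323), whereas any untransliterated Indic character fails the check.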
The repertoires for many of the source characters fall within the following ranges of ISO/IEC 10646-1:2000, for the script concerned:
0900-097F Devanagari
0980-09FF Bengali
0A00-0A7F Gurmukhi
0A80-0AFF Gujarati
0B00-0B7F Oriya
0B80-0BFF Tamil
0C00-0C7F Telugu
0C80-0CFF Kannada
0D00-0D7F Malayalam
0D80-0DFF Sinhala
Some additional Indic scripts whose character repertoires are included in the character repertoires of these scripts are listed in informative annex E.
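The code ranges listed above lend themselves to a simple lookup. The following Python sketch is illustrative only; the names BLOCKS and script_of are hypothetical, and the block names are those given in the list, so Assamese text is reported under the Bengali range exactly as this standard classifies it.

```python
# Ranges as listed in 7.2 (inclusive, hexadecimal code points).
BLOCKS = [
    (0x0900, 0x097F, "Devanagari"),
    (0x0980, 0x09FF, "Bengali"),
    (0x0A00, 0x0A7F, "Gurmukhi"),
    (0x0A80, 0x0AFF, "Gujarati"),
    (0x0B00, 0x0B7F, "Oriya"),
    (0x0B80, 0x0BFF, "Tamil"),
    (0x0C00, 0x0C7F, "Telugu"),
    (0x0C80, 0x0CFF, "Kannada"),
    (0x0D00, 0x0D7F, "Malayalam"),
    (0x0D80, 0x0DFF, "Sinhala"),
]


def script_of(ch):
    """Return the name of the listed range containing the character, or None."""
    cp = ord(ch)
    for lo, hi, name in BLOCKS:
        if lo <= cp <= hi:
            return name
    return None
```

For example, the two additional letters needed for Assamese (U+09F0 and U+09F1) are both reported as "Bengali" by this lookup, since they sit in the 0980-09FF range.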
Consonants are shown with their inherent vowel a.
Only a single form of each Indic character is shown, just as in ISO/IEC 10646-1. Specifications of alternative forms of these characters, including shapes when these are included in conjunct forms or in consonant-vowel combinations, are outside the scope of this International Standard.
This clause gives tables for each script, with references to the rules of 8.1. Numerals are shown in Table A.3 of annex A. Tables 1 to 10 are in the order of ISO 10646-1:2000. Vowels are shown in full form followed by a typical form of the corresponding mātrā.
Normative annex A gives tables showing linguistically equivalent characters in each script (except that Gurmukhi Bindi is not exactly equivalent to anusvara in the other scripts). Extended and ancient characters, apart from numerals, are shown in Table A.2 unless an equivalent modern character exists in another script, in which case they are enclosed in round brackets in Table A.1. (See also the requirements in clause 10.) In Tables A.1 to A.3 the scripts are ordered according to similarity of character repertoires.
A few rare characters for which attestation is not currently available are omitted.
Normative annex B gives the transliteration table (Table B.1) that shall be used when it is necessary to avoid use of Latin letters with diacritics.
Normative annex C gives the recommended method of transliterating Indic characters specified as representing Perso-Arabic characters (Table C.1 and its rules of application).
In the Ref. column of all these tables, the 3-digit decimal references are derived from hexadecimal to decimal conversion of character codes in ISO/IEC 10646-1:2000. Note that the earlier International Standard ISO/IEC 10646-1:1993 also includes these decimal codes explicitly in its tables, in case visual comparisons are required between this International Standard and ISO/IEC 10646-1.
3-digit decimal characters with an additional letter refer to characters not in ISO/IEC 10646-1:2000.
The order of characters in tables follows approximate alphabetical order, rather than the order in ISO/IEC 10646-1:2000.
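The derivation of the 3-digit Ref. values can be sketched in Python. The function name ref_code is hypothetical, and the sketch assumes that the reference is the decimal value of the low-order byte of the code point; that assumption agrees with entries visible in the tables, such as 053 for Devanagari va (U+0935) and 149 for Bengali ka (U+0995), but it is an inference rather than a rule stated in the text. References with an additional letter (for characters not in ISO/IEC 10646-1:2000) are not handled.

```python
def ref_code(ch):
    """Return the assumed 3-digit decimal reference for a single character."""
    low_byte = ord(ch) & 0xFF  # last two hexadecimal digits of the code point
    return f"{low_byte:03d}"   # zero-padded decimal, e.g. 0x35 -> "053"
```

On this reading the two letters specific to Assamese, U+09F0 and U+09F1, map to references 240 and 241, which are the values that appear in Table 2.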
Table 1 Transliteration of Devanagari script
Ref. Indic Transliteration Ref. Indic Transliteration Ref. Indic Transliteration
005 अ a Rule 2 a    027 छ cha    053 व va
006 आ ा ā    028 ज ja    054 श śa
007 इ ि i    029 झ jha    055 ष ṣa
008 ई ी ī    030 ञ ña    056 स sa
009 उ ु u    031 ट ṭa    057 ह ha
010 ऊ ू ū    032 ठ ṭha    088 क़ qa
011 ऋ ृ r̥    033 ड ḍa    089 ख़ k̲h̲a
096 ॠ ॄ r̥̄    034 ढ ḍha    090 ग़ ġa
012 ऌ ॢ l̥    035 ण ṇa    091 ज़ za
097 ॡ ॣ l̥̄    036 त ta    092 ड़ ṛa
015 ए े e    037 थ tha    093 ढ़ ṛha
016 ऐ ै ai    038 द da    094 फ़ fa
019 ओ ो o    039 ध dha    048a … … c
020 औ ौ au    040 न na    051 ळ ḷa
013 ऍ ॅ ê b    042 प pa    002 ं ṁ Rules 3, 5, 8 a
017 ऑ ॉ ô b    043 फ pha    001 ँ m̐ Rules 4, 5, 8 a
021 क ka    044 ब ba    003 ः ḥ
022 ख kha    045 भ bha    003a … …
023 ग ga    046 म ma    003b … …
024 घ gha    047 य ya    061 ऽ ’ Rule 15 a
025 ङ ṅa    048 र ra
026 च ca    050 ल la
NOTE 1 Additional characters from Extended Devanagari may be found in Table A.1. See also Table D.1.
NOTE 2 The treatment of Vedic accents may be found in 8.1 (Rule 14 in clause 8), 8.2 and Table B.1.
a See clause 8.
b English vowels as in bêṭa, bôla, English bat, ball.
c Used in Marathi and Nepali.
Table 2 Transliteration of Bengali script
Ref. Indic Transliteration Ref. Indic Transliteration Ref. Indic Transliteration
133 অ a Rule 2 a    154 চ ca    174 ম ma
134 আ া ā    155 ছ cha    175 য ya
135 ই ি i    156 জ ja    176 র ra
136 ঈ ী ī    157 ঝ jha    240 ৰ ra b
137 উ ু u    158 ঞ ña    178 ল la
138 ঊ ূ ū    159 ট ṭa    241 ৱ va b
139 ঋ ৃ r̥    160 ঠ ṭha    182 শ śa
224 ৠ ৄ r̥̄    161 ড ḍa    183 ষ ṣa
140 ঌ ৢ l̥    162 ঢ ḍha    184 স sa
225 ৡ ৣ l̥̄    163 ণ ṇa    185 হ ha
143 এ ে e    164 ত ta    220 ড় ṛa
144 ঐ ৈ ai    165 থ tha    221 ঢ় ṛha
147 ও ো o    166 দ da    223 য় ẏa Rule 9 a
148 ঔ ৌ au    167 ধ dha    156a জ় za c
149 ক ka    168 ন na    172a … wa c
150 //// khakhakha